Hi all,
This series implements the Permission Overlay Extension introduced in the 2022 VMSA enhancements [1]. It is based on v6.6-rc3.
The Permission Overlay Extension makes it possible to constrain the permissions of memory regions, and to do so from userspace (EL0) without a system call or TLB invalidation.
POE is used here to implement the Memory Protection Keys (pkeys) [2] Linux syscall interface.
The first few patches add the basic framework, then the PKEYS interface is implemented, and then the selftests are made to work on arm64.
There was discussion about what the 'default' protection key value should be; I went with disallow-all (apart from pkey 0), which matches what x86 does.
Patch 15 contains a call to cpus_have_const_cap(), which I couldn't avoid until Mark's patch [3], which re-orders when the alternatives are applied, is merged.
The KVM part isn't tested yet.
I have tested the modified protection_keys test on x86_64 [4], but not PPC.
Hopefully I have CC'd everyone correctly.
Thanks, Joey
Joey Gouly (20):
  arm64/sysreg: add system register POR_EL{0,1}
  arm64/sysreg: update CPACR_EL1 register
  arm64: cpufeature: add Permission Overlay Extension cpucap
  arm64: disable trapping of POR_EL0 to EL2
  arm64: context switch POR_EL0 register
  KVM: arm64: Save/restore POE registers
  arm64: enable the Permission Overlay Extension for EL0
  arm64: add POIndex defines
  arm64: define VM_PKEY_BIT* for arm64
  arm64: mask out POIndex when modifying a PTE
  arm64: enable ARCH_HAS_PKEYS on arm64
  arm64: handle PKEY/POE faults
  arm64: stop using generic mm_hooks.h
  arm64: implement PKEYS support
  arm64: add POE signal support
  arm64: enable PKEY support for CPUs with S1POE
  arm64: enable POE and PIE to coexist
  kselftest/arm64: move get_header()
  selftests: mm: move fpregs printing
  selftests: mm: make protection_keys test work on arm64
 Documentation/arch/arm64/elf_hwcaps.rst       |   3 +
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/include/asm/el2_setup.h            |  10 +-
 arch/arm64/include/asm/hwcap.h                |   1 +
 arch/arm64/include/asm/kvm_host.h             |   4 +
 arch/arm64/include/asm/mman.h                 |   6 +-
 arch/arm64/include/asm/mmu.h                  |   2 +
 arch/arm64/include/asm/mmu_context.h          |  51 ++++++-
 arch/arm64/include/asm/pgtable-hwdef.h        |  10 ++
 arch/arm64/include/asm/pgtable-prot.h         |   8 +-
 arch/arm64/include/asm/pgtable.h              |  28 +++-
 arch/arm64/include/asm/pkeys.h                | 110 ++++++++++++++
 arch/arm64/include/asm/por.h                  |  33 +++++
 arch/arm64/include/asm/processor.h            |   1 +
 arch/arm64/include/asm/sysreg.h               |  16 ++
 arch/arm64/include/asm/traps.h                |   1 +
 arch/arm64/include/uapi/asm/hwcap.h           |   1 +
 arch/arm64/include/uapi/asm/sigcontext.h      |   7 +
 arch/arm64/kernel/cpufeature.c                |  15 ++
 arch/arm64/kernel/cpuinfo.c                   |   1 +
 arch/arm64/kernel/process.c                   |  16 ++
 arch/arm64/kernel/signal.c                    |  51 +++++++
 arch/arm64/kernel/traps.c                     |  12 +-
 arch/arm64/kvm/sys_regs.c                     |   2 +
 arch/arm64/mm/fault.c                         |  44 +++++-
 arch/arm64/mm/mmap.c                          |   7 +
 arch/arm64/mm/mmu.c                           |  38 +++++
 arch/arm64/tools/cpucaps                      |   1 +
 arch/arm64/tools/sysreg                       |  11 +-
 fs/proc/task_mmu.c                            |   2 +
 include/linux/mm.h                            |  11 +-
 .../arm64/signal/testcases/testcases.c        |  23 ---
 .../arm64/signal/testcases/testcases.h        |  26 +++-
 tools/testing/selftests/mm/Makefile           |   2 +-
 tools/testing/selftests/mm/pkey-arm64.h       | 138 ++++++++++++++++++
 tools/testing/selftests/mm/pkey-helpers.h     |   8 +
 tools/testing/selftests/mm/pkey-powerpc.h     |   3 +
 tools/testing/selftests/mm/pkey-x86.h         |   4 +
 tools/testing/selftests/mm/protection_keys.c  |  29 ++--
 39 files changed, 685 insertions(+), 52 deletions(-)
 create mode 100644 arch/arm64/include/asm/pkeys.h
 create mode 100644 arch/arm64/include/asm/por.h
 create mode 100644 tools/testing/selftests/mm/pkey-arm64.h
Add POR_EL{0,1} according to DDI0601 2023-03.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/sysreg.h | 13 +++++++++++++
 arch/arm64/tools/sysreg         |  8 ++++++++
 2 files changed, 21 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 38296579a4fd..cc2d61fd45c3 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -994,6 +994,19 @@
 
 #define PIRx_ELx_PERM(idx, perm)	((perm) << ((idx) * 4))
 
+/*
+ * Permission Overlay Extension (POE) permission encodings.
+ */
+#define POE_NONE	UL(0x0)
+#define POE_R		UL(0x1)
+#define POE_X		UL(0x2)
+#define POE_RX		UL(0x3)
+#define POE_W		UL(0x4)
+#define POE_RW		UL(0x5)
+#define POE_XW		UL(0x6)
+#define POE_RXW		UL(0x7)
+#define POE_MASK	UL(0xf)
+
 #define ARM64_FEATURE_FIELD_BITS	4
 
 /* Defined for compatibility only, do not add new users. */
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 76ce150e7347..20f7dd25c221 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -2504,6 +2504,14 @@ Sysreg PIR_EL2	3	4	10	2	3
 Fields	PIRx_ELx
 EndSysreg
 
+Sysreg POR_EL0	3	3	10	2	4
+Fields	PIRx_ELx
+EndSysreg
+
+Sysreg POR_EL1	3	0	10	2	4
+Fields	PIRx_ELx
+EndSysreg
+
 Sysreg LORSA_EL1	3	0	10	4	0
 Res0	63:52
 Field	51:16	SA
On Wed, Sep 27, 2023 at 03:01:04PM +0100, Joey Gouly wrote:
Add POR_EL{0,1} according to DDI0601 2023-03.
Reviewed-by: Mark Brown <broonie@kernel.org>
Add the E0POE bit, which controls trapping of POR_EL0 accesses from EL0. Updated according to DDI0601 2023-03.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/tools/sysreg | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 20f7dd25c221..a728176786cf 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1741,7 +1741,8 @@ Field	0	M
 EndSysreg
 
 SysregFields CPACR_ELx
-Res0	63:29
+Res0	63:30
+Field	29	E0POE
 Field	28	TTA
 Res0	27:26
 Field	25:24	SMEN
On Wed, Sep 27, 2023 at 03:01:05PM +0100, Joey Gouly wrote:
Add E0POE bit that traps accesses to POR_EL0 from EL0. Updated according to DDI0601 2023-03.
Reviewed-by: Mark Brown <broonie@kernel.org>
It's also up to date with DDI 0601 2023-09 it seems.
Add a cpucap that indicates whether the system supports POE. This is a BOOT_CPU_FEATURE: the boot CPU will enable POE if it has it, so all secondary CPUs must also have the feature.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/cpufeature.c | 7 +++++++
 arch/arm64/tools/cpucaps       | 1 +
 2 files changed, 8 insertions(+)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 444a73c2e638..902885f59396 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2719,6 +2719,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
 	},
+	{
+		.desc = "Stage-1 Permission Overlay Extension (S1POE)",
+		.capability = ARM64_HAS_S1POE,
+		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
+		.matches = has_cpuid_feature,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
+	},
 	{},
 };
 
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index c3f06fdef609..b8348e40f6d9 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -43,6 +43,7 @@ HAS_NO_FPSIMD
 HAS_NO_HW_PREFETCH
 HAS_PAN
 HAS_S1PIE
+HAS_S1POE
 HAS_RAS_EXTN
 HAS_RNG
 HAS_SB
Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/el2_setup.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index b7afaa026842..df5614be4b70 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -184,12 +184,20 @@
 .Lset_pie_fgt_\@:
 	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
 	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
-	cbz	x1, .Lset_fgt_\@
+	cbz	x1, .Lset_poe_fgt_\@
 
 	/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
 	orr	x0, x0, #HFGxTR_EL2_nPIR_EL1
 	orr	x0, x0, #HFGxTR_EL2_nPIRE0_EL1
 
+.Lset_poe_fgt_\@:
+	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
+	ubfx	x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
+	cbz	x1, .Lset_fgt_\@
+
+	/* Disable trapping of POR_EL0 */
+	orr	x0, x0, #HFGxTR_EL2_nPOR_EL0
+
 .Lset_fgt_\@:
 	msr_s	SYS_HFGRTR_EL2, x0
 	msr_s	SYS_HFGWTR_EL2, x0
POR_EL0 is a register that can be modified by userspace directly, so it must be context switched.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  3 +++
 arch/arm64/kernel/process.c        | 16 ++++++++++++++++
 3 files changed, 20 insertions(+)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e5bc54522e71..b3ad719c2d0c 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -179,6 +179,7 @@ struct thread_struct {
 	u64			sctlr_user;
 	u64			svcr;
 	u64			tpidr2_el0;
+	u64			por_el0;
 };
 
 static inline unsigned int thread_get_vl(struct thread_struct *thread,
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index cc2d61fd45c3..0dc8ee423af4 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1007,6 +1007,9 @@
 #define POE_RXW		UL(0x7)
 #define POE_MASK	UL(0xf)
 
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	UL(0x7)
+
 #define ARM64_FEATURE_FIELD_BITS	4
 
 /* Defined for compatibility only, do not add new users. */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 0fcc4eb1a7ab..d33f9717bfcd 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -271,12 +271,19 @@ static void flush_tagged_addr_state(void)
 	clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+static void flush_poe(void)
+{
+	if (cpus_have_final_cap(ARM64_HAS_S1POE))
+		write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+}
+
 void flush_thread(void)
 {
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_poe();
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
@@ -498,6 +505,14 @@ static void erratum_1418040_new_exec(void)
 	preempt_enable();
 }
 
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
+		current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
+	}
+}
+
 /*
  * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
  * this function must be called with preemption disabled and the update to
@@ -533,6 +548,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	ssbs_thread_switch(next);
 	erratum_1418040_thread_switch(next);
 	ptrauth_thread_switch_user(next);
+	permission_overlay_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
On Wed, Sep 27, 2023 at 03:01:08PM +0100, Joey Gouly wrote:
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
+		current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+		write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
+	}
+}
Does this need an ISB or is the POR_EL0 register write self-synchronising?
On Wed, Sep 27, 2023 at 03:01:08PM +0100, Joey Gouly wrote:
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	UL(0x7)
Might be useful to explain why this is the default (and possibly also define in terms of the constants for POR values)?
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
Why the _unlikely here?
Hi Mark,
On Thu, Oct 05, 2023 at 03:14:50PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:08PM +0100, Joey Gouly wrote:
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT	UL(0x7)
Might be useful to explain why this is the default (and possibly also define in terms of the constants for POR values)?
+static void permission_overlay_switch(struct task_struct *next)
+{
+	if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
Why the _unlikely here?
The only options are alternative_has_cap_{likely,unlikely}(). I went with unlikely as currently it is unlikely!
Thanks, Joey
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Tue, Oct 10, 2023 at 10:54:23AM +0100, Joey Gouly wrote:
On Thu, Oct 05, 2023 at 03:14:50PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:08PM +0100, Joey Gouly wrote:
+	if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
Why the _unlikely here?
The only options are alternative_has_cap_{likely,unlikely}(). I went with unlikely as currently it is unlikely!
Oh, if you have to pick one it doesn't really matter and that seems sensible. Usually they're optional so outside of very hot paths they tend to be an antipattern.
Define the new system registers that POE introduces and context switch them.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h | 4 ++++
 arch/arm64/kvm/sys_regs.c         | 2 ++
 2 files changed, 6 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index af06ccb7ee34..205653bc5c2c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -365,6 +365,10 @@ enum vcpu_sysreg {
 	PIR_EL1,	/* Permission Indirection Register 1 (EL1) */
 	PIRE0_EL1,	/* Permission Indirection Register 0 (EL1) */
 
+	/* Permission Overlay Extension registers */
+	POR_EL1,	/* Permission Overlay Register 1 (EL1) */
+	POR_EL0,	/* Permission Overlay Register 0 (EL0) */
+
 	/* 32bit specific registers. */
 	DACR32_EL2,	/* Domain Access Control Register */
 	IFSR32_EL2,	/* Instruction Fault Status Register */
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index e92ec810d449..dbaddadd2a1c 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2124,6 +2124,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
 	{ SYS_DESC(SYS_PIRE0_EL1), access_vm_reg, reset_unknown, PIRE0_EL1 },
 	{ SYS_DESC(SYS_PIR_EL1), access_vm_reg, reset_unknown, PIR_EL1 },
+	{ SYS_DESC(SYS_POR_EL1), access_vm_reg, reset_unknown, POR_EL1 },
 	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
 	{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
@@ -2203,6 +2204,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ PMU_SYS_REG(PMOVSSET_EL0),
 	  .access = access_pmovs, .reg = PMOVSSET_EL0 },
 
+	{ SYS_DESC(SYS_POR_EL0), access_vm_reg, reset_unknown, POR_EL0 },
 	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
 	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
 	{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
Hi Joey,
On Wed, Sep 27, 2023 at 03:01:09PM +0100, Joey Gouly wrote:
Define the new system registers that POE introduces and context switch them.
I'm not seeing the latter part in the diff. I was anticipating that the POE context switching hooks into __sysreg_{save,restore}_{el1,user}_state() like we do for most other bits of sysreg context.
-- Thanks, Oliver
Expose a HWCAP and the ID_AA64MMFR3_EL1.S1POE field to userspace, so they can be used to check whether the CPU supports the feature.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 Documentation/arch/arm64/elf_hwcaps.rst | 3 +++
 arch/arm64/include/asm/hwcap.h          | 1 +
 arch/arm64/include/uapi/asm/hwcap.h     | 1 +
 arch/arm64/kernel/cpufeature.c          | 8 ++++++++
 arch/arm64/kernel/cpuinfo.c             | 1 +
 5 files changed, 14 insertions(+)
diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index 76ff9d7398fd..85f6e9babc7f 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -308,6 +308,9 @@ HWCAP2_MOPS
 HWCAP2_HBC
     Functionality implied by ID_AA64ISAR2_EL1.BC == 0b0001.
 
+HWCAP2_POE
+    Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
+
 4. Unused AT_HWCAP bits
 -----------------------
 
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 521267478d18..196f21b7d11b 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -139,6 +139,7 @@
 #define KERNEL_HWCAP_SME_F16F16	__khwcap2_feature(SME_F16F16)
 #define KERNEL_HWCAP_MOPS	__khwcap2_feature(MOPS)
 #define KERNEL_HWCAP_HBC	__khwcap2_feature(HBC)
+#define KERNEL_HWCAP_POE	__khwcap2_feature(POE)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 53026f45a509..8809ff35d6e4 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -104,5 +104,6 @@
 #define HWCAP2_SME_F16F16	(1UL << 42)
 #define HWCAP2_MOPS		(1UL << 43)
 #define HWCAP2_HBC		(1UL << 44)
+#define HWCAP2_POE		(1UL << 46)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 902885f59396..9b9145fdb208 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -400,6 +400,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
 	ARM64_FTR_END,
@@ -2220,6 +2221,12 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
 	sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
 }
 
+static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
+{
+	sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
+	sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
+}
+
 /* Internal helper functions to match cpu capability type */
 
 static bool cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -2724,6 +2731,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_S1POE,
 		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
 		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_poe,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
 	},
 	{},
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 98fda8500535..5b44e8ab9ab8 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -127,6 +127,7 @@ static const char *const hwcap_str[] = {
 	[KERNEL_HWCAP_SME_F16F16]	= "smef16f16",
 	[KERNEL_HWCAP_MOPS]		= "mops",
 	[KERNEL_HWCAP_HBC]		= "hbc",
+	[KERNEL_HWCAP_POE]		= "poe",
 };
#ifdef CONFIG_COMPAT
On Wed, Sep 27, 2023 at 03:01:10PM +0100, Joey Gouly wrote:
Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to check if the CPU supports the feature.
Please also add the new hwcap to the hwcaps self test.
On Wed, Sep 27, 2023 at 03:01:10PM +0100, Joey Gouly wrote:
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -400,6 +400,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
 	ARM64_FTR_END,
@@ -2220,6 +2221,12 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
 	sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
 }
 
+static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
+{
+	sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
+	sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
+}
+
 /* Internal helper functions to match cpu capability type */
 
 static bool cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -2724,6 +2731,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_S1POE,
 		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
 		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_poe,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
 	},
 	{},
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
I'd also expect to see an update to arm64_elf_hwcaps[]?
The 3-bit POIndex is stored in the PTE at bits 60..62.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index e4944d517c99..fe270fa39110 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -178,6 +178,16 @@
 #define PTE_PI_IDX_2	53	/* PXN */
 #define PTE_PI_IDX_3	54	/* UXN */
 
+/*
+ * POIndex[2:0] encoding (Permission Overlay Extension)
+ */
+#define PTE_PO_IDX_0	(_AT(pteval_t, 1) << 60)
+#define PTE_PO_IDX_1	(_AT(pteval_t, 1) << 61)
+#define PTE_PO_IDX_2	(_AT(pteval_t, 1) << 62)
+
+#define PTE_PO_IDX_MASK	GENMASK_ULL(62, 60)
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
Define the VM_PKEY_BIT* values for arm64, and convert them into the arm64-specific pgprot values.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mman.h |  6 +++++-
 arch/arm64/mm/mmap.c          |  7 +++++++
 fs/proc/task_mmu.c            |  2 ++
 include/linux/mm.h            | 11 ++++++++++-
 4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 5966ee4a6154..22de55518913 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -7,7 +7,7 @@
 #include <uapi/asm/mman.h>
 
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
-	unsigned long pkey __always_unused)
+	unsigned long pkey)
 {
 	unsigned long ret = 0;
 
@@ -17,6 +17,10 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 	if (system_supports_mte() && (prot & PROT_MTE))
 		ret |= VM_MTE;
 
+	ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
+	ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
+	ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
+
 	return ret;
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 8f5b7ce857ed..ef75b04f9d02 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -98,6 +98,13 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
 	if (vm_flags & VM_MTE)
 		prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
 
+	if (vm_flags & VM_PKEY_BIT0)
+		prot |= PTE_PO_IDX_0;
+	if (vm_flags & VM_PKEY_BIT1)
+		prot |= PTE_PO_IDX_1;
+	if (vm_flags & VM_PKEY_BIT2)
+		prot |= PTE_PO_IDX_2;
+
 	return __pgprot(prot);
 }
 EXPORT_SYMBOL(vm_get_page_prot);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3dd5be96691b..fcd94a39bb30 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -689,7 +689,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_PKEY_BIT0)]	= "",
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
+#if VM_PKEY_BIT3
 		[ilog2(VM_PKEY_BIT3)]	= "",
+#endif
 #if VM_PKEY_BIT4
 		[ilog2(VM_PKEY_BIT4)]	= "",
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf5d0b1b16f4..43d96c925b9b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -328,7 +328,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HIGH_ARCH_5	BIT(VM_HIGH_ARCH_BIT_5)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
-#ifdef CONFIG_ARCH_HAS_PKEYS
+#if defined(CONFIG_ARCH_HAS_PKEYS) && !defined(CONFIG_ARM64)
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
 # define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
@@ -341,6 +341,15 @@ extern unsigned int kobjsize(const void *objp);
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
+#if defined(CONFIG_ARM64)
+# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_2
+# define VM_PKEY_BIT0	VM_HIGH_ARCH_2	/* A protection key is a 3-bit value on arm64 */
+# define VM_PKEY_BIT1	VM_HIGH_ARCH_3
+# define VM_PKEY_BIT2	VM_HIGH_ARCH_4
+# define VM_PKEY_BIT3	0
+# define VM_PKEY_BIT4	0
+#endif
+
 #ifdef CONFIG_X86_USER_SHADOW_STACK
 /*
  * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of
Hi Joey,
kernel test robot noticed the following build errors:
[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on linus/master v6.6-rc3 next-20230929]
[cannot apply to akpm-mm/mm-everything kvmarm/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url:    https://github.com/intel-lab-lkp/linux/commits/Joey-Gouly/arm64-sysreg-add-s...
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20230927140123.5283-10-joey.gouly%40arm.com
patch subject: [PATCH v1 09/20] arm64: define VM_PKEY_BIT* for arm64
config: arm64-randconfig-001-20230930 (https://download.01.org/0day-ci/archive/20230930/202309301944.jGh1qzvm-lkp@i...)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230930/202309301944.jGh1qzvm-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309301944.jGh1qzvm-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from fs/proc/meminfo.c:5:
arch/arm64/include/asm/mman.h: In function 'arch_calc_vm_prot_bits':
>> include/linux/mm.h:346:25: error: 'VM_HIGH_ARCH_2' undeclared (first use in this function)
     346 | # define VM_PKEY_BIT0 VM_HIGH_ARCH_2 /* A protection key is a 3-bit value on arm64 */
         |                       ^~~~~~~~~~~~~~
   arch/arm64/include/asm/mman.h:20:29: note: in expansion of macro 'VM_PKEY_BIT0'
      20 |  ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
         |                      ^~~~~~~~~~~~
   include/linux/mm.h:346:25: note: each undeclared identifier is reported only once for each function it appears in
     346 | # define VM_PKEY_BIT0 VM_HIGH_ARCH_2 /* A protection key is a 3-bit value on arm64 */
         |                       ^~~~~~~~~~~~~~
   arch/arm64/include/asm/mman.h:20:29: note: in expansion of macro 'VM_PKEY_BIT0'
      20 |  ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
         |                      ^~~~~~~~~~~~
>> include/linux/mm.h:347:25: error: 'VM_HIGH_ARCH_3' undeclared (first use in this function)
     347 | # define VM_PKEY_BIT1 VM_HIGH_ARCH_3
         |                       ^~~~~~~~~~~~~~
   arch/arm64/include/asm/mman.h:21:29: note: in expansion of macro 'VM_PKEY_BIT1'
      21 |  ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
         |                      ^~~~~~~~~~~~
>> include/linux/mm.h:348:25: error: 'VM_HIGH_ARCH_4' undeclared (first use in this function)
     348 | # define VM_PKEY_BIT2 VM_HIGH_ARCH_4
         |                       ^~~~~~~~~~~~~~
   arch/arm64/include/asm/mman.h:22:29: note: in expansion of macro 'VM_PKEY_BIT2'
      22 |  ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
         |                      ^~~~~~~~~~~~
vim +/VM_HIGH_ARCH_2 +346 include/linux/mm.h
   343
   344	#if defined(CONFIG_ARM64)
   345	# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_2
 > 346	# define VM_PKEY_BIT0	VM_HIGH_ARCH_2	/* A protection key is a 3-bit value on arm64 */
   347	# define VM_PKEY_BIT1	VM_HIGH_ARCH_3
   348	# define VM_PKEY_BIT2	VM_HIGH_ARCH_4
   349	# define VM_PKEY_BIT3	0
   350	# define VM_PKEY_BIT4	0
   351	#endif
   352
On 9/27/23 07:01, Joey Gouly wrote:
-#ifdef CONFIG_ARCH_HAS_PKEYS
+#if defined(CONFIG_ARCH_HAS_PKEYS) && !defined(CONFIG_ARM64)
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
 # define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
@@ -341,6 +341,15 @@ extern unsigned int kobjsize(const void *objp);
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
+#if defined(CONFIG_ARM64)
+# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_2
+# define VM_PKEY_BIT0	VM_HIGH_ARCH_2	/* A protection key is a 3-bit value on arm64 */
+# define VM_PKEY_BIT1	VM_HIGH_ARCH_3
+# define VM_PKEY_BIT2	VM_HIGH_ARCH_4
+# define VM_PKEY_BIT3	0
+# define VM_PKEY_BIT4	0
+#endif
This might be a bit cleaner to just defer out to a per-arch header. It made sense for ppc and x86 to share their copy in here, but now that a third one is around we should probably move this to actual arch/ code.
When a PTE is modified with pte_modify(), the POIndex bits must be included in the mask, so that the old index is cleared and the index from the new protection takes effect.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/pgtable.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7f7d9b1df4e5..98ccfda05716 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -818,9 +818,10 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 	 * Normal and Normal-Tagged are two different memory types and indices
 	 * in MAIR_EL1. The mask below has to include PTE_ATTRINDX_MASK.
 	 */
-	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
+	pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
 			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP |
-			      PTE_ATTRINDX_MASK;
+			      PTE_ATTRINDX_MASK | PTE_PO_IDX_MASK;
+
 	/* preserve the hardware dirty information */
 	if (pte_hw_dirty(pte))
 		pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
Enable the ARCH_HAS_PKEYS config, but provide dummy functions for the entire interface.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/Kconfig             |  1 +
 arch/arm64/include/asm/pkeys.h | 54 ++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mmu.c            |  5 ++++
 3 files changed, 60 insertions(+)
 create mode 100644 arch/arm64/include/asm/pkeys.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b10515c0200b..c941259e0117 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -34,6 +34,7 @@ config ARM64
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	select ARCH_HAS_PKEYS
 	select ARCH_HAS_PTE_DEVMAP
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SETUP_DMA_OPS
diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
new file mode 100644
index 000000000000..5761fb48fd53
--- /dev/null
+++ b/arch/arm64/include/asm/pkeys.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+ *
+ * Based on arch/x86/include/asm/pkeys.h
+ */
+
+#ifndef _ASM_ARM64_PKEYS_H
+#define _ASM_ARM64_PKEYS_H
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
+
+#define arch_max_pkey() 0
+
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val);
+
+static inline bool arch_pkeys_enabled(void)
+{
+	return false;
+}
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	return -1;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+		int prot, int pkey)
+{
+	return -1;
+}
+
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+	return -1;
+}
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+	return false;
+}
+
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+	return -1;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+	return -EINVAL;
+}
+
+#endif /* _ASM_ARM64_PKEYS_H */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 47781bec6171..3b7f354a3ec3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1487,3 +1487,8 @@ void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte
 {
 	set_pte_at(vma->vm_mm, addr, ptep, pte);
 }
+
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val)
+{
+	return -ENOSPC;
+}
Hi Joey,
kernel test robot noticed the following build errors:
[auto build test ERROR on arm64/for-next/core] [also build test ERROR on linus/master v6.6-rc3 next-20230929] [cannot apply to akpm-mm/mm-everything kvmarm/next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Joey-Gouly/arm64-sysreg-add-s... base: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core patch link: https://lore.kernel.org/r/20230927140123.5283-12-joey.gouly%40arm.com patch subject: [PATCH v1 11/20] arm64: enable ARCH_HAS_PKEYS on arm64 config: arm64-randconfig-001-20230930 (https://download.01.org/0day-ci/archive/20230930/202309302050.SO3QXUkm-lkp@i...) compiler: aarch64-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230930/202309302050.SO3QXUkm-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202309302050.SO3QXUkm-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/arm64/include/asm/hwcap.h:50, from arch/arm64/include/asm/cpufeature.h:12, from arch/arm64/include/asm/ptrace.h:11, from arch/arm64/include/asm/irqflags.h:10, from include/linux/irqflags.h:17, from include/linux/spinlock.h:59, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/mm.h:7, from include/linux/pagewalk.h:5, from fs/proc/task_mmu.c:2: fs/proc/task_mmu.c: In function 'show_smap_vma_flags': include/linux/mm.h:346:25: error: 'VM_HIGH_ARCH_2' undeclared (first use in this function) 346 | # define VM_PKEY_BIT0 VM_HIGH_ARCH_2 /* A protection key is a 3-bit value on arm64 */ | ^~~~~~~~~~~~~~ include/linux/log2.h:158:30: note: in definition of macro 'ilog2' 158 | __builtin_constant_p(n) ? \ | ^ fs/proc/task_mmu.c:689:24: note: in expansion of macro 'VM_PKEY_BIT0' 689 | [ilog2(VM_PKEY_BIT0)] = "", | ^~~~~~~~~~~~ include/linux/mm.h:346:25: note: each undeclared identifier is reported only once for each function it appears in 346 | # define VM_PKEY_BIT0 VM_HIGH_ARCH_2 /* A protection key is a 3-bit value on arm64 */ | ^~~~~~~~~~~~~~ include/linux/log2.h:158:30: note: in definition of macro 'ilog2' 158 | __builtin_constant_p(n) ? \ | ^ fs/proc/task_mmu.c:689:24: note: in expansion of macro 'VM_PKEY_BIT0' 689 | [ilog2(VM_PKEY_BIT0)] = "", | ^~~~~~~~~~~~
include/linux/log2.h:157:1: error: array index in initializer not of integer type
157 | ( \ | ^ fs/proc/task_mmu.c:689:18: note: in expansion of macro 'ilog2' 689 | [ilog2(VM_PKEY_BIT0)] = "", | ^~~~~ include/linux/log2.h:157:1: note: (near initialization for 'mnemonics') 157 | ( \ | ^ fs/proc/task_mmu.c:689:18: note: in expansion of macro 'ilog2' 689 | [ilog2(VM_PKEY_BIT0)] = "", | ^~~~~ include/linux/mm.h:347:25: error: 'VM_HIGH_ARCH_3' undeclared (first use in this function) 347 | # define VM_PKEY_BIT1 VM_HIGH_ARCH_3 | ^~~~~~~~~~~~~~ include/linux/log2.h:158:30: note: in definition of macro 'ilog2' 158 | __builtin_constant_p(n) ? \ | ^ fs/proc/task_mmu.c:690:24: note: in expansion of macro 'VM_PKEY_BIT1' 690 | [ilog2(VM_PKEY_BIT1)] = "", | ^~~~~~~~~~~~
include/linux/log2.h:157:1: error: array index in initializer not of integer type
157 | ( \ | ^ fs/proc/task_mmu.c:690:18: note: in expansion of macro 'ilog2' 690 | [ilog2(VM_PKEY_BIT1)] = "", | ^~~~~ include/linux/log2.h:157:1: note: (near initialization for 'mnemonics') 157 | ( \ | ^ fs/proc/task_mmu.c:690:18: note: in expansion of macro 'ilog2' 690 | [ilog2(VM_PKEY_BIT1)] = "", | ^~~~~ include/linux/mm.h:348:25: error: 'VM_HIGH_ARCH_4' undeclared (first use in this function) 348 | # define VM_PKEY_BIT2 VM_HIGH_ARCH_4 | ^~~~~~~~~~~~~~ include/linux/log2.h:158:30: note: in definition of macro 'ilog2' 158 | __builtin_constant_p(n) ? \ | ^ fs/proc/task_mmu.c:691:24: note: in expansion of macro 'VM_PKEY_BIT2' 691 | [ilog2(VM_PKEY_BIT2)] = "", | ^~~~~~~~~~~~
include/linux/log2.h:157:1: error: array index in initializer not of integer type
157 | ( \ | ^ fs/proc/task_mmu.c:691:18: note: in expansion of macro 'ilog2' 691 | [ilog2(VM_PKEY_BIT2)] = "", | ^~~~~ include/linux/log2.h:157:1: note: (near initialization for 'mnemonics') 157 | ( \ | ^ fs/proc/task_mmu.c:691:18: note: in expansion of macro 'ilog2' 691 | [ilog2(VM_PKEY_BIT2)] = "", | ^~~~~
vim +157 include/linux/log2.h
b311e921b385b5a Robert P. J. Day 2007-10-16 69 f0d1b0b30d250a0 David Howells 2006-12-08 70 /** dbef91ec5482239 Martin Wilck 2018-04-18 71 * const_ilog2 - log base 2 of 32-bit or a 64-bit constant unsigned value a1c4d24e02d093e Randy Dunlap 2017-09-30 72 * @n: parameter f0d1b0b30d250a0 David Howells 2006-12-08 73 * dbef91ec5482239 Martin Wilck 2018-04-18 74 * Use this where sparse expects a true constant expression, e.g. for array dbef91ec5482239 Martin Wilck 2018-04-18 75 * indices. f0d1b0b30d250a0 David Howells 2006-12-08 76 */ dbef91ec5482239 Martin Wilck 2018-04-18 77 #define const_ilog2(n) \ f0d1b0b30d250a0 David Howells 2006-12-08 78 ( \ f0d1b0b30d250a0 David Howells 2006-12-08 79 __builtin_constant_p(n) ? ( \ 474c90156c8dcc2 Linus Torvalds 2017-03-02 80 (n) < 2 ? 0 : \ f0d1b0b30d250a0 David Howells 2006-12-08 81 (n) & (1ULL << 63) ? 63 : \ f0d1b0b30d250a0 David Howells 2006-12-08 82 (n) & (1ULL << 62) ? 62 : \ f0d1b0b30d250a0 David Howells 2006-12-08 83 (n) & (1ULL << 61) ? 61 : \ f0d1b0b30d250a0 David Howells 2006-12-08 84 (n) & (1ULL << 60) ? 60 : \ f0d1b0b30d250a0 David Howells 2006-12-08 85 (n) & (1ULL << 59) ? 59 : \ f0d1b0b30d250a0 David Howells 2006-12-08 86 (n) & (1ULL << 58) ? 58 : \ f0d1b0b30d250a0 David Howells 2006-12-08 87 (n) & (1ULL << 57) ? 57 : \ f0d1b0b30d250a0 David Howells 2006-12-08 88 (n) & (1ULL << 56) ? 56 : \ f0d1b0b30d250a0 David Howells 2006-12-08 89 (n) & (1ULL << 55) ? 55 : \ f0d1b0b30d250a0 David Howells 2006-12-08 90 (n) & (1ULL << 54) ? 54 : \ f0d1b0b30d250a0 David Howells 2006-12-08 91 (n) & (1ULL << 53) ? 53 : \ f0d1b0b30d250a0 David Howells 2006-12-08 92 (n) & (1ULL << 52) ? 52 : \ f0d1b0b30d250a0 David Howells 2006-12-08 93 (n) & (1ULL << 51) ? 51 : \ f0d1b0b30d250a0 David Howells 2006-12-08 94 (n) & (1ULL << 50) ? 50 : \ f0d1b0b30d250a0 David Howells 2006-12-08 95 (n) & (1ULL << 49) ? 49 : \ f0d1b0b30d250a0 David Howells 2006-12-08 96 (n) & (1ULL << 48) ? 
48 : \ f0d1b0b30d250a0 David Howells 2006-12-08 97 (n) & (1ULL << 47) ? 47 : \ f0d1b0b30d250a0 David Howells 2006-12-08 98 (n) & (1ULL << 46) ? 46 : \ f0d1b0b30d250a0 David Howells 2006-12-08 99 (n) & (1ULL << 45) ? 45 : \ f0d1b0b30d250a0 David Howells 2006-12-08 100 (n) & (1ULL << 44) ? 44 : \ f0d1b0b30d250a0 David Howells 2006-12-08 101 (n) & (1ULL << 43) ? 43 : \ f0d1b0b30d250a0 David Howells 2006-12-08 102 (n) & (1ULL << 42) ? 42 : \ f0d1b0b30d250a0 David Howells 2006-12-08 103 (n) & (1ULL << 41) ? 41 : \ f0d1b0b30d250a0 David Howells 2006-12-08 104 (n) & (1ULL << 40) ? 40 : \ f0d1b0b30d250a0 David Howells 2006-12-08 105 (n) & (1ULL << 39) ? 39 : \ f0d1b0b30d250a0 David Howells 2006-12-08 106 (n) & (1ULL << 38) ? 38 : \ f0d1b0b30d250a0 David Howells 2006-12-08 107 (n) & (1ULL << 37) ? 37 : \ f0d1b0b30d250a0 David Howells 2006-12-08 108 (n) & (1ULL << 36) ? 36 : \ f0d1b0b30d250a0 David Howells 2006-12-08 109 (n) & (1ULL << 35) ? 35 : \ f0d1b0b30d250a0 David Howells 2006-12-08 110 (n) & (1ULL << 34) ? 34 : \ f0d1b0b30d250a0 David Howells 2006-12-08 111 (n) & (1ULL << 33) ? 33 : \ f0d1b0b30d250a0 David Howells 2006-12-08 112 (n) & (1ULL << 32) ? 32 : \ f0d1b0b30d250a0 David Howells 2006-12-08 113 (n) & (1ULL << 31) ? 31 : \ f0d1b0b30d250a0 David Howells 2006-12-08 114 (n) & (1ULL << 30) ? 30 : \ f0d1b0b30d250a0 David Howells 2006-12-08 115 (n) & (1ULL << 29) ? 29 : \ f0d1b0b30d250a0 David Howells 2006-12-08 116 (n) & (1ULL << 28) ? 28 : \ f0d1b0b30d250a0 David Howells 2006-12-08 117 (n) & (1ULL << 27) ? 27 : \ f0d1b0b30d250a0 David Howells 2006-12-08 118 (n) & (1ULL << 26) ? 26 : \ f0d1b0b30d250a0 David Howells 2006-12-08 119 (n) & (1ULL << 25) ? 25 : \ f0d1b0b30d250a0 David Howells 2006-12-08 120 (n) & (1ULL << 24) ? 24 : \ f0d1b0b30d250a0 David Howells 2006-12-08 121 (n) & (1ULL << 23) ? 23 : \ f0d1b0b30d250a0 David Howells 2006-12-08 122 (n) & (1ULL << 22) ? 22 : \ f0d1b0b30d250a0 David Howells 2006-12-08 123 (n) & (1ULL << 21) ? 
21 : \ f0d1b0b30d250a0 David Howells 2006-12-08 124 (n) & (1ULL << 20) ? 20 : \ f0d1b0b30d250a0 David Howells 2006-12-08 125 (n) & (1ULL << 19) ? 19 : \ f0d1b0b30d250a0 David Howells 2006-12-08 126 (n) & (1ULL << 18) ? 18 : \ f0d1b0b30d250a0 David Howells 2006-12-08 127 (n) & (1ULL << 17) ? 17 : \ f0d1b0b30d250a0 David Howells 2006-12-08 128 (n) & (1ULL << 16) ? 16 : \ f0d1b0b30d250a0 David Howells 2006-12-08 129 (n) & (1ULL << 15) ? 15 : \ f0d1b0b30d250a0 David Howells 2006-12-08 130 (n) & (1ULL << 14) ? 14 : \ f0d1b0b30d250a0 David Howells 2006-12-08 131 (n) & (1ULL << 13) ? 13 : \ f0d1b0b30d250a0 David Howells 2006-12-08 132 (n) & (1ULL << 12) ? 12 : \ f0d1b0b30d250a0 David Howells 2006-12-08 133 (n) & (1ULL << 11) ? 11 : \ f0d1b0b30d250a0 David Howells 2006-12-08 134 (n) & (1ULL << 10) ? 10 : \ f0d1b0b30d250a0 David Howells 2006-12-08 135 (n) & (1ULL << 9) ? 9 : \ f0d1b0b30d250a0 David Howells 2006-12-08 136 (n) & (1ULL << 8) ? 8 : \ f0d1b0b30d250a0 David Howells 2006-12-08 137 (n) & (1ULL << 7) ? 7 : \ f0d1b0b30d250a0 David Howells 2006-12-08 138 (n) & (1ULL << 6) ? 6 : \ f0d1b0b30d250a0 David Howells 2006-12-08 139 (n) & (1ULL << 5) ? 5 : \ f0d1b0b30d250a0 David Howells 2006-12-08 140 (n) & (1ULL << 4) ? 4 : \ f0d1b0b30d250a0 David Howells 2006-12-08 141 (n) & (1ULL << 3) ? 3 : \ f0d1b0b30d250a0 David Howells 2006-12-08 142 (n) & (1ULL << 2) ? 
2 : \ 474c90156c8dcc2 Linus Torvalds 2017-03-02 143 1) : \ dbef91ec5482239 Martin Wilck 2018-04-18 144 -1) dbef91ec5482239 Martin Wilck 2018-04-18 145 dbef91ec5482239 Martin Wilck 2018-04-18 146 /** dbef91ec5482239 Martin Wilck 2018-04-18 147 * ilog2 - log base 2 of 32-bit or a 64-bit unsigned value dbef91ec5482239 Martin Wilck 2018-04-18 148 * @n: parameter dbef91ec5482239 Martin Wilck 2018-04-18 149 * dbef91ec5482239 Martin Wilck 2018-04-18 150 * constant-capable log of base 2 calculation dbef91ec5482239 Martin Wilck 2018-04-18 151 * - this can be used to initialise global variables from constant data, hence dbef91ec5482239 Martin Wilck 2018-04-18 152 * the massive ternary operator construction dbef91ec5482239 Martin Wilck 2018-04-18 153 * dbef91ec5482239 Martin Wilck 2018-04-18 154 * selects the appropriately-sized optimised version depending on sizeof(n) dbef91ec5482239 Martin Wilck 2018-04-18 155 */ dbef91ec5482239 Martin Wilck 2018-04-18 156 #define ilog2(n) \ dbef91ec5482239 Martin Wilck 2018-04-18 @157 ( \ dbef91ec5482239 Martin Wilck 2018-04-18 158 __builtin_constant_p(n) ? \ 2f78788b55baa34 Jakub Jelinek 2020-12-15 159 ((n) < 2 ? 0 : \ 2f78788b55baa34 Jakub Jelinek 2020-12-15 160 63 - __builtin_clzll(n)) : \ f0d1b0b30d250a0 David Howells 2006-12-08 161 (sizeof(n) <= 4) ? \ f0d1b0b30d250a0 David Howells 2006-12-08 162 __ilog2_u32(n) : \ f0d1b0b30d250a0 David Howells 2006-12-08 163 __ilog2_u64(n) \ f0d1b0b30d250a0 David Howells 2006-12-08 164 ) f0d1b0b30d250a0 David Howells 2006-12-08 165
If a memory fault is caused by an overlay/pkey permission check, report it to userspace as a SIGSEGV with si_code SEGV_PKUERR.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/traps.h | 1 + arch/arm64/kernel/traps.c | 12 ++++++++-- arch/arm64/mm/fault.c | 44 +++++++++++++++++++++++++++++++--- 3 files changed, 52 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h index d66dfb3a72dd..dae51eccfc19 100644 --- a/arch/arm64/include/asm/traps.h +++ b/arch/arm64/include/asm/traps.h @@ -26,6 +26,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn) void force_signal_inject(int signal, int code, unsigned long address, unsigned long err); void arm64_notify_segfault(unsigned long addr); void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str); +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey); void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str); void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index 8b70759cdbb9..b68682c284a2 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str) __show_regs(regs); }
-void arm64_force_sig_fault(int signo, int code, unsigned long far, - const char *str) +void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, + const char *str, int pkey) { arm64_show_signal(signo, str); if (signo == SIGKILL) force_sig(SIGKILL); + else if (code == SEGV_PKUERR) + force_sig_pkuerr((void __user *)far, pkey); else force_sig_fault(signo, code, (void __user *)far); }
+void arm64_force_sig_fault(int signo, int code, unsigned long far, + const char *str) +{ + arm64_force_sig_fault_pkey(signo, code, far, str, 0); +} + void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str) { diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 2e5d1e238af9..a76906199479 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -37,6 +37,7 @@ #include <asm/esr.h> #include <asm/kprobes.h> #include <asm/mte.h> +#include <asm/pkeys.h> #include <asm/processor.h> #include <asm/sysreg.h> #include <asm/system_misc.h> @@ -497,6 +498,23 @@ static void do_bad_area(unsigned long far, unsigned long esr, #define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000) #define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
+static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma, + unsigned int mm_flags) +{ + unsigned long iss2 = ESR_ELx_ISS2(esr); + + if (!arch_pkeys_enabled()) + return false; + + if (iss2 & ESR_ELx_Overlay) + return true; + + return !arch_vma_access_permitted(vma, + mm_flags & FAULT_FLAG_WRITE, + mm_flags & FAULT_FLAG_INSTRUCTION, + mm_flags & FAULT_FLAG_REMOTE); +} + static vm_fault_t __do_page_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned int mm_flags, unsigned long vm_flags, @@ -688,9 +706,29 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, * Something tried to access memory that isn't in our memory * map. */ - arm64_force_sig_fault(SIGSEGV, - fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR, - far, inf->name); + int fault_kind; + /* + * The pkey value that we return to userspace can be different + * from the pkey that caused the fault. + * + * 1. T1 : mprotect_key(foo, PAGE_SIZE, pkey=4); + * 2. T1 : set POR_EL0 to deny access to pkey=4, touches page + * 3. T1 : faults... + * 4. T2 : mprotect_key(foo, PAGE_SIZE, pkey=5); + * 5. T1 : enters fault handler, takes mmap_lock, etc... + * 6. T1 : reaches here, sees vma_pkey(vma)=5, when we really + * faulted on a pte with its pkey=4. + */ + int pkey = vma_pkey(vma); + + if (fault_from_pkey(esr, vma, mm_flags)) + fault_kind = SEGV_PKUERR; + else + fault_kind = fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR; + + arm64_force_sig_fault_pkey(SIGSEGV, + fault_kind, + far, inf->name, pkey); }
return 0;
These functions will be extended to support pkeys.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/mmu_context.h | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index a6fb325424e7..c0eeed54225e 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -20,7 +20,6 @@ #include <asm/cpufeature.h> #include <asm/daifflags.h> #include <asm/proc-fns.h> -#include <asm-generic/mm_hooks.h> #include <asm/cputype.h> #include <asm/sysreg.h> #include <asm/tlbflush.h> @@ -209,6 +208,20 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm) return 0; }
+static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) +{ + return 0; +} + +static inline void arch_exit_mmap(struct mm_struct *mm) +{ +} + +static inline void arch_unmap(struct mm_struct *mm, + unsigned long start, unsigned long end) +{ +} + #ifdef CONFIG_ARM64_SW_TTBR0_PAN static inline void update_saved_ttbr0(struct task_struct *tsk, struct mm_struct *mm) @@ -298,6 +311,12 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm) return -1UL >> 8; }
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, + bool write, bool execute, bool foreign) +{ + return true; +} + #include <asm-generic/mmu_context.h>
#endif /* !__ASSEMBLY__ */
Implement the PKEYS interface, using the Permission Overlay Extension.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/mmu.h | 2 + arch/arm64/include/asm/mmu_context.h | 32 ++++++++++++- arch/arm64/include/asm/pgtable.h | 23 +++++++++- arch/arm64/include/asm/pkeys.h | 68 +++++++++++++++++++++++++--- arch/arm64/include/asm/por.h | 33 ++++++++++++++ arch/arm64/mm/mmu.c | 35 +++++++++++++- 6 files changed, 184 insertions(+), 9 deletions(-) create mode 100644 arch/arm64/include/asm/por.h
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 94b68850cb9f..ed2cd66347d8 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -25,6 +25,8 @@ typedef struct { refcount_t pinned; void *vdso; unsigned long flags; + + u8 pkey_allocation_map; } mm_context_t;
/* diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index c0eeed54225e..aa739b87d49b 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -15,6 +15,7 @@ #include <linux/sched/hotplug.h> #include <linux/mm_types.h> #include <linux/pgtable.h> +#include <linux/pkeys.h>
#include <asm/cacheflush.h> #include <asm/cpufeature.h> @@ -205,11 +206,24 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm) { atomic64_set(&mm->context.id, 0); refcount_set(&mm->context.pinned, 0); + + // pkey 0 is the default, so always reserve it. + mm->context.pkey_allocation_map = 0x1; + return 0; }
+static inline void arch_dup_pkeys(struct mm_struct *oldmm, + struct mm_struct *mm) +{ + /* Duplicate the oldmm pkey state in mm: */ + mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map; +} + static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { + arch_dup_pkeys(oldmm, mm); + return 0; }
@@ -311,10 +325,26 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm) return -1UL >> 8; }
+/* + * We only want to enforce protection keys on the current process + * because we effectively have no access to POR_EL0 for other + * processes or any way to tell *which* POR_EL0 in a threaded + * process we could use. + * + * So do not enforce things if the VMA is not from the current + * mm, or if we are in a kernel thread. + */ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign) { - return true; + if (!arch_pkeys_enabled()) + return true; + + /* allow access if the VMA is not one from this process */ + if (foreign || vma_is_foreign(vma)) + return true; + + return por_el0_allows_pkey(vma_pkey(vma), write, execute); }
#include <asm-generic/mmu_context.h> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 98ccfda05716..761575bbc943 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -30,6 +30,7 @@
#include <asm/cmpxchg.h> #include <asm/fixmap.h> +#include <asm/por.h> #include <linux/mmdebug.h> #include <linux/mm_types.h> #include <linux/sched.h> @@ -143,6 +144,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) #define pte_accessible(mm, pte) \ (mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
+static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute) +{ + u64 por; + + if (!cpus_have_final_cap(ARM64_HAS_S1POE)) + return true; + + por = read_sysreg_s(SYS_POR_EL0); + + if (write) + return por_elx_allows_write(por, pkey); + + if (execute) + return por_elx_allows_exec(por, pkey); + + return por_elx_allows_read(por, pkey); +} + /* * p??_access_permitted() is true for valid user mappings (PTE_USER * bit set, subject to the write permission check). For execute-only @@ -151,7 +170,9 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) * PTE_VALID bit set. */ #define pte_access_permitted(pte, write) \ - (((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte))) + (((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && \ + (!(write) || pte_write(pte)) && \ + por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false)) #define pmd_access_permitted(pmd, write) \ (pte_access_permitted(pmd_pte(pmd), (write))) #define pud_access_permitted(pud, write) \ diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h index 5761fb48fd53..a80c654da93d 100644 --- a/arch/arm64/include/asm/pkeys.h +++ b/arch/arm64/include/asm/pkeys.h @@ -10,7 +10,7 @@
#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
-#define arch_max_pkey() 0 +#define arch_max_pkey() 7
int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val); @@ -22,33 +22,89 @@ static inline bool arch_pkeys_enabled(void)
static inline int vma_pkey(struct vm_area_struct *vma) { - return -1; + return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT; }
static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey) { - return -1; + if (pkey != -1) + return pkey; + + return vma_pkey(vma); }
static inline int execute_only_pkey(struct mm_struct *mm) { + // Execute-only mappings are handled by EPAN/FEAT_PAN3. + WARN_ON_ONCE(!cpus_have_final_cap(ARM64_HAS_EPAN)); + return -1; }
+#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map) +#define mm_set_pkey_allocated(mm, pkey) do { \ + mm_pkey_allocation_map(mm) |= (1U << pkey); \ +} while (0) +#define mm_set_pkey_free(mm, pkey) do { \ + mm_pkey_allocation_map(mm) &= ~(1U << pkey); \ +} while (0) + static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey) { - return false; + /* + * "Allocated" pkeys are those that have been returned + * from pkey_alloc() or pkey 0 which is allocated + * implicitly when the mm is created. + */ + if (pkey < 0) + return false; + if (pkey >= arch_max_pkey()) + return false; + + return mm_pkey_allocation_map(mm) & (1U << pkey); }
+/* + * Returns a positive, 3-bit key on success, or -1 on failure. + */ static inline int mm_pkey_alloc(struct mm_struct *mm) { - return -1; + /* + * Note: this is the one and only place we make sure + * that the pkey is valid as far as the hardware is + * concerned. The rest of the kernel trusts that + * only good, valid pkeys come out of here. + */ + u8 all_pkeys_mask = ((1U << arch_max_pkey()) - 1); + int ret; + + if (!arch_pkeys_enabled()) + return -1; + + /* + * Are we out of pkeys? We must handle this specially + * because ffz() behavior is undefined if there are no + * zeros. + */ + if (mm_pkey_allocation_map(mm) == all_pkeys_mask) + return -1; + + ret = ffz(mm_pkey_allocation_map(mm)); + + mm_set_pkey_allocated(mm, ret); + + return ret; }
static inline int mm_pkey_free(struct mm_struct *mm, int pkey) { - return -EINVAL; + if (!mm_pkey_is_allocated(mm, pkey)) + return -EINVAL; + + mm_set_pkey_free(mm, pkey); + + return 0; }
#endif /* _ASM_ARM64_PKEYS_H */ diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h new file mode 100644 index 000000000000..90484dae9920 --- /dev/null +++ b/arch/arm64/include/asm/por.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2023 Arm Ltd. +*/ + +#ifndef _ASM_ARM64_POR_H +#define _ASM_ARM64_POR_H + +#define POR_BITS_PER_PKEY 4 +#define POR_ELx_IDX(por_elx, idx) (((por_elx) >> (idx * POR_BITS_PER_PKEY)) & 0xf) + +static inline bool por_elx_allows_read(u64 por, u8 pkey) +{ + u8 perm = POR_ELx_IDX(por, pkey); + + return perm & POE_R; +} + +static inline bool por_elx_allows_write(u64 por, u8 pkey) +{ + u8 perm = POR_ELx_IDX(por, pkey); + + return perm & POE_W; +} + +static inline bool por_elx_allows_exec(u64 por, u8 pkey) +{ + u8 perm = POR_ELx_IDX(por, pkey); + + return perm & POE_X; +} + +#endif /* _ASM_ARM64_POR_H */ diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 3b7f354a3ec3..8241bdc365f9 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -25,6 +25,7 @@ #include <linux/vmalloc.h> #include <linux/set_memory.h> #include <linux/kfence.h> +#include <linux/pkeys.h>
#include <asm/barrier.h> #include <asm/cputype.h> @@ -1490,5 +1491,37 @@ void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte
int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val) { - return -ENOSPC; + u64 new_por = POE_RXW; + u64 old_por; + u64 pkey_shift; + + if (!arch_pkeys_enabled()) + return -ENOSPC; + + /* + * This code should only be called with valid 'pkey' + * values originating from in-kernel users. Complain + * if a bad value is observed. + */ + if (WARN_ON_ONCE(pkey >= arch_max_pkey())) + return -EINVAL; + + /* Set the bits we need in POR: */ + if (init_val & PKEY_DISABLE_ACCESS) + new_por = POE_X; + else if (init_val & PKEY_DISABLE_WRITE) + new_por = POE_RX; + + /* Shift the bits in to the correct place in POR for pkey: */ + pkey_shift = pkey * POR_BITS_PER_PKEY; + new_por <<= pkey_shift; + + /* Get old POR and mask off any old bits in place: */ + old_por = read_sysreg_s(SYS_POR_EL0); + old_por &= ~(POE_MASK << pkey_shift); + + /* Write old part along with new part: */ + write_sysreg_s(old_por | new_por, SYS_POR_EL0); + + return 0; }
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/uapi/asm/sigcontext.h | 7 ++++ arch/arm64/kernel/signal.c | 51 ++++++++++++++++++++++++ 2 files changed, 58 insertions(+)
diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h index f23c1dc3f002..cef85eeaf541 100644 --- a/arch/arm64/include/uapi/asm/sigcontext.h +++ b/arch/arm64/include/uapi/asm/sigcontext.h @@ -98,6 +98,13 @@ struct esr_context { __u64 esr; };
+#define POE_MAGIC 0x504f4530 + +struct poe_context { + struct _aarch64_ctx head; + __u64 por_el0; +}; + /* * extra_context: describes extra space in the signal frame for * additional structures that don't fit in sigcontext.__reserved[]. diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index 0e8beb3349ea..3517271ae0dc 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -62,6 +62,7 @@ struct rt_sigframe_user_layout { unsigned long zt_offset; unsigned long extra_offset; unsigned long end_offset; + unsigned long poe_offset; };
#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16) @@ -182,6 +183,8 @@ struct user_ctxs { u32 za_size; struct zt_context __user *zt; u32 zt_size; + struct poe_context __user *poe; + u32 poe_size; };
static int preserve_fpsimd_context(struct fpsimd_context __user *ctx) @@ -227,6 +230,20 @@ static int restore_fpsimd_context(struct user_ctxs *user) return err ? -EFAULT : 0; }
+static int restore_poe_context(struct user_ctxs *user) +{ + u64 por_el0; + int err = 0; + + if (user->poe_size != sizeof(*user->poe)) + return -EINVAL; + + __get_user_error(por_el0, &(user->poe->por_el0), err); + if (!err) + write_sysreg_s(por_el0, SYS_POR_EL0); + + return err; +}
#ifdef CONFIG_ARM64_SVE
@@ -590,6 +607,7 @@ static int parse_user_sigframe(struct user_ctxs *user, user->tpidr2 = NULL; user->za = NULL; user->zt = NULL; + user->poe = NULL;
if (!IS_ALIGNED((unsigned long)base, 16)) goto invalid; @@ -640,6 +658,17 @@ static int parse_user_sigframe(struct user_ctxs *user, /* ignore */ break;
+ case POE_MAGIC: + if (!cpus_have_final_cap(ARM64_HAS_S1POE)) + goto invalid; + + if (user->poe) + goto invalid; + + user->poe = (struct poe_context __user *)head; + user->poe_size = size; + break; + case SVE_MAGIC: if (!system_supports_sve() && !system_supports_sme()) goto invalid; @@ -812,6 +841,9 @@ static int restore_sigframe(struct pt_regs *regs, if (err == 0 && system_supports_sme2() && user.zt) err = restore_zt_context(&user);
+ if (err == 0 && cpus_have_final_cap(ARM64_HAS_S1POE) && user.poe) + err = restore_poe_context(&user); + return err; }
@@ -928,6 +960,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user, } }
+ if (cpus_have_const_cap(ARM64_HAS_S1POE)) { + err = sigframe_alloc(user, &user->poe_offset, + sizeof(struct poe_context)); + if (err) + return err; + } + return sigframe_alloc_end(user); }
@@ -968,6 +1007,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user, __put_user_error(current->thread.fault_code, &esr_ctx->esr, err); }
+ if (cpus_have_final_cap(ARM64_HAS_S1POE) && err == 0 && user->poe_offset) { + struct poe_context __user *poe_ctx = + apply_user_offset(user, user->poe_offset); + + __put_user_error(POE_MAGIC, &poe_ctx->head.magic, err); + __put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err); + __put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err); + } + /* Scalable Vector Extension state (including streaming), if present */ if ((system_supports_sve() || system_supports_sme()) && err == 0 && user->sve_offset) { @@ -1119,6 +1167,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka, sme_smstop(); }
+ if (cpus_have_final_cap(ARM64_HAS_S1POE)) + write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0); + if (ka->sa.sa_flags & SA_RESTORER) sigtramp = ka->sa.sa_restorer; else
On Wed, Sep 27, 2023 at 03:01:18PM +0100, Joey Gouly wrote:
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
It'd be nice to have at least a basic test that validates that we generate a POE signal frame when expected, though that should be a very minor thing which is unlikely to ever actually spot anything.
I'd also expect to see matching support added to ptrace.
On Thu, Oct 05, 2023 at 03:34:29PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:18PM +0100, Joey Gouly wrote:
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
It'd be nice to have at least a basic test that validates that we generate a POE signal frame when expected, though that should be a very minor thing which is unlikely to ever actually spot anything.
Actually, now I think about it, we at least need an update to the frame parser in userspace so it knows about the new frame. Without that it'll warn whenever it parses the signal context on any system that has POE enabled.
Hi Mark,
On Mon, Oct 09, 2023 at 03:49:29PM +0100, Mark Brown wrote:
On Thu, Oct 05, 2023 at 03:34:29PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:18PM +0100, Joey Gouly wrote:
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
It'd be nice to have at least a basic test that validates that we generate a POE signal frame when expected, though that should be a very minor thing which is unlikely to ever actually spot anything.
Actually, now I think about it, we at least need an update to the frame parser in userspace so it knows about the new frame. Without that it'll warn whenever it parses the signal context on any system that has POE enabled.
Do you mean in libc?
Thanks, Joey
On Tue, Oct 10, 2023 at 10:58:02AM +0100, Joey Gouly wrote:
On Mon, Oct 09, 2023 at 03:49:29PM +0100, Mark Brown wrote:
Actually, now I think about it, we at least need an update to the frame parser in userspace so it knows about the new frame. Without that it'll warn whenever it parses the signal context on any system that has POE enabled.
Do you mean in libc?
No, in the signal handling selftests.
Hi Mark,
On Thu, Oct 05, 2023 at 03:34:29PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:18PM +0100, Joey Gouly wrote:
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
It'd be nice to have at least a basic test that validates that we generate a POE signal frame when expected, though that should be a very minor thing which is unlikely to ever actually spot anything.
I'd also expect to see matching support added to ptrace.
The selftests/mm/protection_keys.c test looks for the POE signal frame; do you think we need a separate test?
I will look into ptrace.
Thanks, Joey
On Tue, Oct 10, 2023 at 10:57:02AM +0100, Joey Gouly wrote:
On Thu, Oct 05, 2023 at 03:34:29PM +0100, Mark Brown wrote:
On Wed, Sep 27, 2023 at 03:01:18PM +0100, Joey Gouly wrote:
Add PKEY support to signals, by saving and restoring POR_EL0 from the stackframe.
It'd be nice to have at least a basic test that validates that we generate a POE signal frame when expected, though that should be a very minor thing which is unlikely to ever actually spot anything.
The selftests/mm/protection_keys.c test looks for the POE signal frame; do you think we need a separate test?
Like I say it'd be a very minor thing to have one - it is more just a thing you'd go looking for in the signals tests rather than something that's absolutely essential. For trivial things like TPIDR2 I've just added a trivial thing that verifies that the frame is present iff the matching HWCAP is set. Having the test in the mm tests is probably fine though.
Now that PKEY support has been implemented, enable it for CPUs that support S1POE.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/pkeys.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h index a80c654da93d..9946de35035d 100644 --- a/arch/arm64/include/asm/pkeys.h +++ b/arch/arm64/include/asm/pkeys.h @@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
static inline bool arch_pkeys_enabled(void) { - return false; + return cpus_have_final_cap(ARM64_HAS_S1POE); }
static inline int vma_pkey(struct vm_area_struct *vma)
Set the EL0/userspace indirection encodings to be the overlay-enabled variants of the permissions.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/pgtable-prot.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index eed814b00a38..06fb944ef252 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -141,10 +141,10 @@ extern bool arm64_use_ng_mappings;
#define PIE_E0 ( \ PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_X_O) | \ - PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX) | \ - PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RWX) | \ - PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R) | \ - PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW)) + PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O) | \ + PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RWX_O) | \ + PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R_O) | \ + PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW_O))
#define PIE_E1 ( \ PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_NONE_O) | \
Put this function in the header so that it can be used by other tests, without needing to link to testcases.c.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Shuah Khan shuah@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com --- .../arm64/signal/testcases/testcases.c | 23 ------------------- .../arm64/signal/testcases/testcases.h | 23 ++++++++++++++++++- 2 files changed, 22 insertions(+), 24 deletions(-)
diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c index 9f580b55b388..fe950b6bca6b 100644 --- a/tools/testing/selftests/arm64/signal/testcases/testcases.c +++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c @@ -6,29 +6,6 @@
#include "testcases.h"
-struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic, - size_t resv_sz, size_t *offset) -{ - size_t offs = 0; - struct _aarch64_ctx *found = NULL; - - if (!head || resv_sz < HDR_SZ) - return found; - - while (offs <= resv_sz - HDR_SZ && - head->magic != magic && head->magic) { - offs += head->size; - head = GET_RESV_NEXT_HEAD(head); - } - if (head->magic == magic) { - found = head; - if (offset) - *offset = offs; - } - - return found; -} - bool validate_extra_context(struct extra_context *extra, char **err, void **extra_data, size_t *extra_size) { diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h index a08ab0d6207a..c9bd18e7c538 100644 --- a/tools/testing/selftests/arm64/signal/testcases/testcases.h +++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h @@ -88,7 +88,28 @@ struct fake_sigframe { bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err);
struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic, - size_t resv_sz, size_t *offset); + size_t resv_sz, size_t *offset) +{ + size_t offs = 0; + struct _aarch64_ctx *found = NULL; + + if (!head || resv_sz < HDR_SZ) + return found; + + while (offs <= resv_sz - HDR_SZ && + head->magic != magic && head->magic) { + offs += head->size; + head = GET_RESV_NEXT_HEAD(head); + } + if (head->magic == magic) { + found = head; + if (offset) + *offset = offs; + } + + return found; +} +
static inline struct _aarch64_ctx *get_terminator(struct _aarch64_ctx *head, size_t resv_sz,
arm64's fpregs are not at a constant offset from sigcontext. Since this is not an important part of the test, don't print the fpregs pointer on arm64.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Shuah Khan shuah@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com --- tools/testing/selftests/mm/pkey-powerpc.h | 1 + tools/testing/selftests/mm/pkey-x86.h | 2 ++ tools/testing/selftests/mm/protection_keys.c | 6 ++++++ 3 files changed, 9 insertions(+)
diff --git a/tools/testing/selftests/mm/pkey-powerpc.h b/tools/testing/selftests/mm/pkey-powerpc.h index ae5df26104e5..6275d0f474b3 100644 --- a/tools/testing/selftests/mm/pkey-powerpc.h +++ b/tools/testing/selftests/mm/pkey-powerpc.h @@ -9,6 +9,7 @@ #endif #define REG_IP_IDX PT_NIP #define REG_TRAPNO PT_TRAP +#define MCONTEXT_FPREGS #define gregs gp_regs #define fpregs fp_regs #define si_pkey_offset 0x20 diff --git a/tools/testing/selftests/mm/pkey-x86.h b/tools/testing/selftests/mm/pkey-x86.h index 814758e109c0..b9170a26bfcb 100644 --- a/tools/testing/selftests/mm/pkey-x86.h +++ b/tools/testing/selftests/mm/pkey-x86.h @@ -15,6 +15,8 @@
#endif
+#define MCONTEXT_FPREGS + #ifndef PKEY_DISABLE_ACCESS # define PKEY_DISABLE_ACCESS 0x1 #endif diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c index 48dc151f8fca..b3dbd76ea27c 100644 --- a/tools/testing/selftests/mm/protection_keys.c +++ b/tools/testing/selftests/mm/protection_keys.c @@ -314,7 +314,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext) ucontext_t *uctxt = vucontext; int trapno; unsigned long ip; +#ifdef MCONTEXT_FPREGS char *fpregs; +#endif #if defined(__i386__) || defined(__x86_64__) /* arch */ u32 *pkey_reg_ptr; int pkey_reg_offset; @@ -330,7 +332,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO]; ip = uctxt->uc_mcontext.gregs[REG_IP_IDX]; +#ifdef MCONTEXT_FPREGS fpregs = (char *) uctxt->uc_mcontext.fpregs; +#endif
dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n", __func__, trapno, ip, si_code_str(si->si_code), @@ -359,7 +363,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext) #endif /* arch */
dprintf1("siginfo: %p\n", si); +#ifdef MCONTEXT_FPREGS dprintf1(" fpregs: %p\n", fpregs); +#endif
if ((si->si_code == SEGV_MAPERR) || (si->si_code == SEGV_ACCERR) ||
On 9/27/23 07:01, Joey Gouly wrote:
arm64's fpregs are not at a constant offset from sigcontext. Since this is not an important part of the test, don't print the fpregs pointer on arm64.
Acked-by: Dave Hansen dave.hansen@linux.intel.com
The encoding of the pkey register differs on arm64 from that on x86/ppc. On those platforms, a set bit in the register disables a permission; on arm64, a set bit in the register indicates that the permission is allowed.
This drops two asserts of the form: assert(read_pkey_reg() <= orig_pkey_reg); because on arm64 this doesn't hold, due to the encoding.
The pkey must be reset to both access-allow and write-allow in the signal handler. pkey_access_allow() currently works for PowerPC as PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.
Access to the uc_mcontext is abstracted, as arm64 has a different structure.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Shuah Khan shuah@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com --- .../arm64/signal/testcases/testcases.h | 3 + tools/testing/selftests/mm/Makefile | 2 +- tools/testing/selftests/mm/pkey-arm64.h | 138 ++++++++++++++++++ tools/testing/selftests/mm/pkey-helpers.h | 8 + tools/testing/selftests/mm/pkey-powerpc.h | 2 + tools/testing/selftests/mm/pkey-x86.h | 2 + tools/testing/selftests/mm/protection_keys.c | 23 +-- 7 files changed, 167 insertions(+), 11 deletions(-) create mode 100644 tools/testing/selftests/mm/pkey-arm64.h
diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h index c9bd18e7c538..b6c591117d07 100644 --- a/tools/testing/selftests/arm64/signal/testcases/testcases.h +++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h @@ -25,6 +25,9 @@ #define HDR_SZ \ sizeof(struct _aarch64_ctx)
+#define GET_UC_RESV_HEAD(uc) \ + (struct _aarch64_ctx *)(&(uc->uc_mcontext.__reserved)) + #define GET_SF_RESV_HEAD(sf) \ (struct _aarch64_ctx *)(&(sf).uc.uc_mcontext.__reserved)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 6a9fc5693145..f50987233f55 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -95,7 +95,7 @@ TEST_GEN_FILES += $(BINARIES_64) endif else
-ifneq (,$(findstring $(ARCH),ppc64)) +ifneq (,$(filter $(ARCH),arm64 ppc64)) TEST_GEN_FILES += protection_keys endif
diff --git a/tools/testing/selftests/mm/pkey-arm64.h b/tools/testing/selftests/mm/pkey-arm64.h new file mode 100644 index 000000000000..3a5d8ba69bb7 --- /dev/null +++ b/tools/testing/selftests/mm/pkey-arm64.h @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2023 Arm Ltd. +*/ + +#ifndef _PKEYS_ARM64_H +#define _PKEYS_ARM64_H + +#include "vm_util.h" +/* for signal frame parsing */ +#include "../arm64/signal/testcases/testcases.h" + +#ifndef SYS_mprotect_key +# define SYS_mprotect_key 288 +#endif +#ifndef SYS_pkey_alloc +# define SYS_pkey_alloc 289 +# define SYS_pkey_free 290 +#endif +#define MCONTEXT_IP(mc) mc.pc +#define MCONTEXT_TRAPNO(mc) -1 + +#define PKEY_MASK 0xf + +#define POE_X 0x2 +#define POE_RX 0x3 +#define POE_RWX 0x7 + +#define NR_PKEYS 7 +#define NR_RESERVED_PKEYS 1 /* pkey-0 */ + +#define PKEY_ALLOW_ALL 0x77777777 + +#define PKEY_BITS_PER_PKEY 4 +#define PAGE_SIZE sysconf(_SC_PAGESIZE) +#undef HPAGE_SIZE +#define HPAGE_SIZE default_huge_page_size() + +/* 4-byte instructions * 16384 = 64K page */ +#define __page_o_noops() asm(".rept 16384 ; nop; .endr") + +static inline u64 __read_pkey_reg(void) +{ + u64 pkey_reg = 0; + + // POR_EL0 + asm volatile("mrs %0, S3_3_c10_c2_4" : "=r" (pkey_reg)); + + return pkey_reg; +} + +static inline void __write_pkey_reg(u64 pkey_reg) +{ + u64 por = pkey_reg; + + dprintf4("%s() changing %016llx to %016llx\n", + __func__, __read_pkey_reg(), pkey_reg); + + // POR_EL0 + asm volatile("msr S3_3_c10_c2_4, %0\nisb" :: "r" (por) :); + + dprintf4("%s() pkey register after changing %016llx to %016llx\n", + __func__, __read_pkey_reg(), pkey_reg); +} + +static inline int cpu_has_pkeys(void) +{ + /* No simple way to determine this */ + return 1; +} + +static inline u32 pkey_bit_position(int pkey) +{ + return pkey * PKEY_BITS_PER_PKEY; +} + +static inline int get_arch_reserved_keys(void) +{ + return NR_RESERVED_PKEYS; +} + +void expect_fault_on_read_execonly_key(void *p1, int pkey) +{ +} + +void 
*malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey) +{ + return PTR_ERR_ENOTSUP; +} + +#define set_pkey_bits set_pkey_bits +static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags) +{ + u32 shift = pkey_bit_position(pkey); + u64 new_val = POE_RWX; + + /* mask out bits from pkey in old value */ + reg &= ~((u64)PKEY_MASK << shift); + + if (flags & PKEY_DISABLE_ACCESS) + new_val = POE_X; + else if (flags & PKEY_DISABLE_WRITE) + new_val = POE_RX; + + /* OR in new bits for pkey */ + reg |= new_val << shift; + + return reg; +} + +#define get_pkey_bits get_pkey_bits +static inline u64 get_pkey_bits(u64 reg, int pkey) +{ + u32 shift = pkey_bit_position(pkey); + /* + * shift down the relevant bits to the lowest two, then + * mask off all the other higher bits + */ + u32 perm = (reg >> shift) & PKEY_MASK; + + if (perm == POE_X) + return PKEY_DISABLE_ACCESS; + if (perm == POE_RX) + return PKEY_DISABLE_WRITE; + return 0; +} + +static void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey) +{ + struct _aarch64_ctx *ctx = GET_UC_RESV_HEAD(uctxt); + struct poe_context *poe_ctx = + (struct poe_context *) get_header(ctx, POE_MAGIC, + sizeof(uctxt->uc_mcontext), NULL); + if (poe_ctx) + poe_ctx->por_el0 = pkey; +} + +#endif /* _PKEYS_ARM64_H */ diff --git a/tools/testing/selftests/mm/pkey-helpers.h b/tools/testing/selftests/mm/pkey-helpers.h index 92f3be3dd8e5..07d131f98043 100644 --- a/tools/testing/selftests/mm/pkey-helpers.h +++ b/tools/testing/selftests/mm/pkey-helpers.h @@ -91,12 +91,17 @@ void record_pkey_malloc(void *ptr, long size, int prot); #include "pkey-x86.h" #elif defined(__powerpc64__) /* arch */ #include "pkey-powerpc.h" +#elif defined(__aarch64__) /* arch */ +#include "pkey-arm64.h" #else /* arch */ #error Architecture not supported #endif /* arch */
+#ifndef PKEY_MASK #define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE) +#endif
+#ifndef set_pkey_bits static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags) { u32 shift = pkey_bit_position(pkey); @@ -106,7 +111,9 @@ static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags) reg |= (flags & PKEY_MASK) << shift; return reg; } +#endif
+#ifndef get_pkey_bits static inline u64 get_pkey_bits(u64 reg, int pkey) { u32 shift = pkey_bit_position(pkey); @@ -116,6 +123,7 @@ static inline u64 get_pkey_bits(u64 reg, int pkey) */ return ((reg >> shift) & PKEY_MASK); } +#endif
extern u64 shadow_pkey_reg;
diff --git a/tools/testing/selftests/mm/pkey-powerpc.h b/tools/testing/selftests/mm/pkey-powerpc.h index 6275d0f474b3..3d0c0bdae5bc 100644 --- a/tools/testing/selftests/mm/pkey-powerpc.h +++ b/tools/testing/selftests/mm/pkey-powerpc.h @@ -8,6 +8,8 @@ # define SYS_pkey_free 385 #endif #define REG_IP_IDX PT_NIP +#define MCONTEXT_IP(mc) mc.gp_regs[REG_IP_IDX] +#define MCONTEXT_TRAPNO(mc) mc.gp_regs[REG_TRAPNO] #define REG_TRAPNO PT_TRAP #define MCONTEXT_FPREGS #define gregs gp_regs diff --git a/tools/testing/selftests/mm/pkey-x86.h b/tools/testing/selftests/mm/pkey-x86.h index b9170a26bfcb..5f28e26a2511 100644 --- a/tools/testing/selftests/mm/pkey-x86.h +++ b/tools/testing/selftests/mm/pkey-x86.h @@ -15,6 +15,8 @@
#endif
+#define MCONTEXT_IP(mc) mc.gregs[REG_IP_IDX] +#define MCONTEXT_TRAPNO(mc) mc.gregs[REG_TRAPNO] #define MCONTEXT_FPREGS
#ifndef PKEY_DISABLE_ACCESS diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c index b3dbd76ea27c..14883d551531 100644 --- a/tools/testing/selftests/mm/protection_keys.c +++ b/tools/testing/selftests/mm/protection_keys.c @@ -147,7 +147,7 @@ void abort_hooks(void) * will then fault, which makes sure that the fault code handles * execute-only memory properly. */ -#ifdef __powerpc64__ +#if defined(__powerpc64__) || defined(__aarch64__) /* This way, both 4K and 64K alignment are maintained */ __attribute__((__aligned__(65536))) #else @@ -212,7 +212,6 @@ void pkey_disable_set(int pkey, int flags) unsigned long syscall_flags = 0; int ret; int pkey_rights; - u64 orig_pkey_reg = read_pkey_reg();
dprintf1("START->%s(%d, 0x%x)\n", __func__, pkey, flags); @@ -242,8 +241,6 @@ void pkey_disable_set(int pkey, int flags)
dprintf1("%s(%d) pkey_reg: 0x%016llx\n", __func__, pkey, read_pkey_reg()); - if (flags) - pkey_assert(read_pkey_reg() >= orig_pkey_reg); dprintf1("END<---%s(%d, 0x%x)\n", __func__, pkey, flags); } @@ -253,7 +250,6 @@ void pkey_disable_clear(int pkey, int flags) unsigned long syscall_flags = 0; int ret; int pkey_rights = hw_pkey_get(pkey, syscall_flags); - u64 orig_pkey_reg = read_pkey_reg();
pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
@@ -273,8 +269,6 @@ void pkey_disable_clear(int pkey, int flags)
dprintf1("%s(%d) pkey_reg: 0x%016llx\n", __func__, pkey, read_pkey_reg()); - if (flags) - assert(read_pkey_reg() <= orig_pkey_reg); }
void pkey_write_allow(int pkey) @@ -330,8 +324,8 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext) __func__, __LINE__, __read_pkey_reg(), shadow_pkey_reg);
- trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO]; - ip = uctxt->uc_mcontext.gregs[REG_IP_IDX]; + trapno = MCONTEXT_TRAPNO(uctxt->uc_mcontext); + ip = MCONTEXT_IP(uctxt->uc_mcontext); #ifdef MCONTEXT_FPREGS fpregs = (char *) uctxt->uc_mcontext.fpregs; #endif @@ -395,6 +389,8 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext) #elif defined(__powerpc64__) /* arch */ /* restore access and let the faulting instruction continue */ pkey_access_allow(siginfo_pkey); +#elif defined(__aarch64__) + aarch64_write_signal_pkey(uctxt, PKEY_ALLOW_ALL); #endif /* arch */ pkey_faults++; dprintf1("<<<<==================================================\n"); @@ -908,7 +904,9 @@ void expected_pkey_fault(int pkey) * test program continue. We now have to restore it. */ if (__read_pkey_reg() != 0) -#else /* arch */ +#elif defined(__aarch64__) + if (__read_pkey_reg() != PKEY_ALLOW_ALL) +#else if (__read_pkey_reg() != shadow_pkey_reg) #endif /* arch */ pkey_assert(0); @@ -1498,6 +1496,11 @@ void test_executing_on_unreadable_memory(int *ptr, u16 pkey) lots_o_noops_around_write(&scratch); do_not_expect_pkey_fault("executing on PROT_EXEC memory"); expect_fault_on_read_execonly_key(p1, pkey); + + // Reset back to PROT_EXEC | PROT_READ for architectures that support + // non-PKEY execute-only permissions. + ret = mprotect_pkey(p1, PAGE_SIZE, PROT_EXEC | PROT_READ, (u64)pkey); + pkey_assert(!ret); }
void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
On 9/27/23 07:01, Joey Gouly wrote:
The encoding of the pkey register differs on arm64 from that on x86/ppc. On those platforms, a set bit in the register disables a permission; on arm64, a set bit in the register indicates that the permission is allowed.
This drops two asserts of the form: assert(read_pkey_reg() <= orig_pkey_reg); because on arm64 this doesn't hold, due to the encoding.
The pkey must be reset to both access-allow and write-allow in the signal handler. pkey_access_allow() currently works for PowerPC as PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.
Access to the uc_mcontext is abstracted, as arm64 has a different structure.
This all looks sane enough. Welcome to the pkey party! :)
Acked-by: Dave Hansen dave.hansen@linux.intel.com