Hi All,
I pushed a draft version of the backport to git://git.linaro.org/kernel/linux-linaro-stable.git, branch lts-v4.9-kpti.

The backport is based on the arm64 kpti tree: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti

The following commits aren't included, for a couple of reasons: (a) the bpf patch is already included in LTS; (b) Falkor isn't supported in LTS; (c) PAN isn't supported in LTS:
  bpf: prevent out-of-bounds speculation
  arm64: Implement branch predictor hardening for Falkor
  arm64: kpti: Fix the interaction between ASID switching and software PAN
  arm64: mm: Introduce TTBR_ASID_MASK for getting at the ASID in the TTBR
  perf: arm_spe: Fail device probe when arm64_kernel_unmapped_at_el0()
  arm64: erratum: Work around Falkor erratum #E1003 in trampoline code
  arm64: mm: Fix and re-enable ARM64_SW_TTBR0_PAN
  arm64: mm: Rename post_ttbr0_update_workaround
  arm64: mm: Remove pre_ttbr0_update_workaround for Falkor erratum #E1003
  arm64: mm: Temporarily disable ARM64_SW_TTBR0_PAN
KernelCI testing shows it boots on 82 boards but fails on 12 boards at the latest commit c9966001e6cac2e7c9f8: https://kernelci.org/boot/all/job/lsk/branch/linux-linaro-lsk-v4.9-test/kern...

Debugging shows the failure is due to __bp_harden_hyp_vecs_start not being mapped during boot, so I guess I missed some changes between arch/arm/kvm/arm.c and virt/kvm/arm/arm.c when picking up d00aff63b0b ("arm64: KVM: Use per-CPU vector when BP hardening is enabled").

Any hints are appreciated! The relevant call chain is:

kvm_arch_init -> init_hyp_mode (conditionally) -> kvm_map_vectors -> create_hyp_mappings(__bp_harden_hyp_vecs_start)
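For reference, the upstream per-CPU vector commit adds a kvm_map_vectors() helper along these lines (quoted from memory as a sketch; __bp_harden_hyp_vecs_end, kvm_ksym_ref and the exact shape are assumptions rather than the verbatim backport). If this hunk got lost in the arch/arm/kvm/arm.c vs. virt/kvm/arm/arm.c shuffle, the hyp mapping for the hardened vectors is never created:

#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
static inline int kvm_map_vectors(void)
{
	/* Make the hardened vectors visible at EL2 before KVM switches
	 * to the per-CPU vector base. */
	return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start),
				   kvm_ksym_ref(__bp_harden_hyp_vecs_end),
				   PAGE_HYP_EXEC);
}
#else
static inline int kvm_map_vectors(void)
{
	return 0;
}
#endif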
Regards,
Alex
CPU features: detected feature: 32-bit EL0 Support
[ 0.092699] CPU features: detected feature: Kernel page table isolation (KPTI)
[ 0.100912] Unable to handle kernel paging request at virtual address ffff800000095000
[ 0.101805] pgd = ffff000008e81000
[ 0.102224] [ffff800000095000] *pgd=000000005eff8003, *pud=000000005eff7003, *pmd=000000005eff6003, *pte=00e0000040095f93
[ 0.104045] Internal error: Oops: 9600004f [#1] PREEMPT SMP
[ 0.104710] Modules linked in:
[ 0.105400] CPU: 0 PID: 10 Comm: migration/0 Not tainted 4.9.78-00038-gc996600-dirty #67
[ 0.105993] Hardware name: linux,dummy-virt (DT)
[ 0.106548] task: ffff80001d86f080 task.stack: ffff80001d8a8000
[ 0.108086] PC is at __memcpy+0xc0/0x180
[ 0.108422] LR is at enable_psci_bp_hardening+0x190/0x218
[ 0.108830] pc : [<ffff0000083786c0>] lr : [<ffff00000808ca98>] pstate: 800000c5
[ 0.109294] sp : ffff80001d8abcd0
[PATCH 01/38] arm64: mm: Use non-global mappings for kernel space
[PATCH 02/38] arm64: mm: Move ASID from TTBR0 to TTBR1
[PATCH 03/38] arm64: mm: Allocate ASIDs in pairs
[PATCH 04/38] arm64: mm: Add arm64_kernel_unmapped_at_el0 helper
[PATCH 05/38] arm64: mm: Invalidate both kernel and user ASIDs when
[PATCH 06/38] arm64: factor out entry stack manipulation
[PATCH 07/38] arm64: entry.S: move SError handling into a C function
[PATCH 08/38] module: extend 'rodata=off' boot cmdline parameter to
[PATCH 09/38] arm64: entry: Add exception trampoline page for
[PATCH 10/38] arm64: mm: Map entry trampoline into trampoline and
[PATCH 11/38] arm64: entry: Explicitly pass exception level to
[PATCH 12/38] arm64: entry: Hook up entry trampoline to exception
[PATCH 13/38] arm64: tls: Avoid unconditional zeroing of tpidrro_el0
[PATCH 14/38] arm64: entry: Add fake CPU feature for unmapping the
[PATCH 15/38] arm64: Kconfig: Add CONFIG_UNMAP_KERNEL_AT_EL0
[PATCH 16/38] arm64: kaslr: Put kernel vectors address in separate
[PATCH 17/38] arm64: cpufeature: Pass capability structure to
[PATCH 18/38] arm64: Allow checking of a CPU-local erratum
[PATCH 19/38] arm64: capabilities: Handle duplicate entries for a
[PATCH 20/38] arm64: use RET instruction for exiting the trampoline
[PATCH 21/38] arm64: Kconfig: Reword UNMAP_KERNEL_AT_EL0 kconfig
[PATCH 22/38] arm64: Take into account ID_AA64PFR0_EL1.CSV3
[PATCH 23/38] drivers/firmware: Expose psci_get_version through
[PATCH 24/38] mm: Introduce lm_alias
[PATCH 25/38] arm64: Move post_ttbr_update_workaround to C code
[PATCH 26/38] arm64: Add skeleton to harden the branch predictor
[PATCH 27/38] arm64: KVM: Use per-CPU vector when BP hardening is
[PATCH 28/38] arm64: KVM: Make PSCI_VERSION a fast path
[PATCH 29/38] arm64: cpu_errata: Allow an erratum to be match for all
[PATCH 30/38] arm64: cputype: Add missing MIDR values for Cortex-A72
[PATCH 31/38] arm64: Implement branch predictor hardening for
[PATCH 32/38] arm64: cputype: Add MIDR values for Cavium ThunderX2
[PATCH 33/38] arm: Add BTB invalidation on switch_mm for Cortex-A9,
[PATCH 34/38] arm: KVM: Invalidate BTB on guest exit
[PATCH 35/38] arm: Add icache invalidation on switch_mm for
[PATCH 36/38] arm: KVM: Invalidate icache on guest exit for
[PATCH 37/38] arm: Invalidate BTB on prefetch abort outside of user
[PATCH 38/38] arm: Invalidate icache on prefetch abort outside of
From: Will Deacon will.deacon@arm.com
In preparation for unmapping the kernel whilst running in userspace, make the kernel mappings non-global so we can avoid expensive TLB invalidation on kernel exit to userspace.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit e046eb0c9bf26d94be9e4592c00c7a78b0fa9bfd)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip PTE_RDONLY of PAGE_NONE in arch/arm64/include/asm/pgtable-prot.h
---
 arch/arm64/include/asm/kernel-pgtable.h | 12 ++++++++++--
 arch/arm64/include/asm/pgtable-prot.h | 21 +++++++++++++++------
 2 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h index 7e51d1b..e4ddac9 100644 --- a/arch/arm64/include/asm/kernel-pgtable.h +++ b/arch/arm64/include/asm/kernel-pgtable.h @@ -71,8 +71,16 @@ /* * Initial memory map attributes. */ -#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED) -#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S) +#define _SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED) +#define _SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S) + +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +#define SWAPPER_PTE_FLAGS (_SWAPPER_PTE_FLAGS | PTE_NG) +#define SWAPPER_PMD_FLAGS (_SWAPPER_PMD_FLAGS | PMD_SECT_NG) +#else +#define SWAPPER_PTE_FLAGS _SWAPPER_PTE_FLAGS +#define SWAPPER_PMD_FLAGS _SWAPPER_PMD_FLAGS +#endif
#if ARM64_SWAPPER_USES_SECTION_MAPS #define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS) diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index 2142c77..84b5283 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -34,8 +34,16 @@
#include <asm/pgtable-types.h>
-#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED) -#define PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S) +#define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED) +#define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S) + +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +#define PROT_DEFAULT (_PROT_DEFAULT | PTE_NG) +#define PROT_SECT_DEFAULT (_PROT_SECT_DEFAULT | PMD_SECT_NG) +#else +#define PROT_DEFAULT _PROT_DEFAULT +#define PROT_SECT_DEFAULT _PROT_SECT_DEFAULT +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
#define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE)) #define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE)) @@ -48,6 +56,7 @@ #define PROT_SECT_NORMAL_EXEC (PROT_SECT_DEFAULT | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
#define _PAGE_DEFAULT (PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL)) +#define _HYP_PAGE_DEFAULT (_PAGE_DEFAULT & ~PTE_NG)
#define PAGE_KERNEL __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE) #define PAGE_KERNEL_RO __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_RDONLY) @@ -55,15 +64,15 @@ #define PAGE_KERNEL_EXEC __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE) #define PAGE_KERNEL_EXEC_CONT __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_CONT)
-#define PAGE_HYP __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN) -#define PAGE_HYP_EXEC __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY) -#define PAGE_HYP_RO __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN) +#define PAGE_HYP __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN) +#define PAGE_HYP_EXEC __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY) +#define PAGE_HYP_RO __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN) #define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
#define PAGE_S2 __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY) #define PAGE_S2_DEVICE __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
-#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_PXN | PTE_UXN) +#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_NG | PTE_PXN | PTE_UXN) #define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE) #define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_WRITE) #define PAGE_COPY __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN)
From: Will Deacon will.deacon@arm.com
In preparation for mapping kernelspace and userspace with different ASIDs, move the ASID to TTBR1 and update switch_mm to context-switch TTBR0 via an invalid mapping (the zero page).
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 7655abb953860485940d4de74fb45a8192149bb6)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip pre_ttbr0_update_workaround in arch/arm64/mm/proc.S
---
 arch/arm64/include/asm/mmu_context.h | 7 +++++++
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 arch/arm64/include/asm/proc-fns.h | 6 ------
 arch/arm64/mm/proc.S | 9 ++++++---
 4 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index a501853..b96c4799 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -50,6 +50,13 @@ static inline void cpu_set_reserved_ttbr0(void) isb(); }
+static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm) +{ + BUG_ON(pgd == swapper_pg_dir); + cpu_set_reserved_ttbr0(); + cpu_do_switch_mm(virt_to_phys(pgd),mm); +} + /* * TCR.T0SZ value to use when the ID map is active. Usually equals * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index eb0c2bd..8df4cb6 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -272,6 +272,7 @@ #define TCR_TG1_4K (UL(2) << TCR_TG1_SHIFT) #define TCR_TG1_64K (UL(3) << TCR_TG1_SHIFT)
+#define TCR_A1 (UL(1) << 22) #define TCR_ASID16 (UL(1) << 36) #define TCR_TBI0 (UL(1) << 37) #define TCR_HA (UL(1) << 39) diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h index 14ad6e4..16cef2e 100644 --- a/arch/arm64/include/asm/proc-fns.h +++ b/arch/arm64/include/asm/proc-fns.h @@ -35,12 +35,6 @@ extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
#include <asm/memory.h>
-#define cpu_switch_mm(pgd,mm) \ -do { \ - BUG_ON(pgd == swapper_pg_dir); \ - cpu_do_switch_mm(virt_to_phys(pgd),mm); \ -} while (0) - #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* __ASM_PROCFNS_H */ diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index 352c73b..3378f3e 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -132,9 +132,12 @@ ENDPROC(cpu_do_resume) * - pgd_phys - physical address of new TTB */ ENTRY(cpu_do_switch_mm) + mrs x2, ttbr1_el1 mmid x1, x1 // get mm->context.id - bfi x0, x1, #48, #16 // set the ASID - msr ttbr0_el1, x0 // set TTBR0 + bfi x2, x1, #48, #16 // set the ASID + msr ttbr1_el1, x2 // in TTBR1 (since TCR.A1 is set) + isb + msr ttbr0_el1, x0 // now update TTBR0 isb alternative_if ARM64_WORKAROUND_CAVIUM_27456 ic iallu @@ -222,7 +225,7 @@ ENTRY(__cpu_setup) * both user and kernel. */ ldr x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \ - TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0 + TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0 | TCR_A1 tcr_set_idmap_t0sz x10, x9
/*
From: Will Deacon will.deacon@arm.com
In preparation for separate kernel/user ASIDs, allocate them in pairs for each mm_struct. The bottom bit distinguishes the two: if it is set, then the ASID will map only userspace.
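As a side note on how the pairing works (a standalone illustration of the asid2idx()/idx2asid() macros in the diff below, not kernel code): allocator index N expands to kernel ASID 2N and user ASID 2N+1, so the user half is selected simply by setting bit 0 of the ASID.

#include <stdio.h>

#define ASID_BITS		16
#define ASID_FIRST_VERSION	(1UL << ASID_BITS)
#define ASID_MASK		(~(ASID_FIRST_VERSION - 1))
#define asid2idx(asid)		(((asid) & ~ASID_MASK) >> 1)
#define idx2asid(idx)		(((idx) << 1) & ~ASID_MASK)

int main(void)
{
	unsigned long idx = 5;
	unsigned long kernel_asid = idx2asid(idx);	/* 10: even, kernel */
	unsigned long user_asid = kernel_asid | 1;	/* 11: odd, user only */

	printf("idx %lu -> kernel ASID %lu, user ASID %lu, back to idx %lu\n",
	       idx, kernel_asid, user_asid, asid2idx(kernel_asid));
	return 0;
}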
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 0c8ea531b7740754cf374ca8b7510655f569c5e3)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip MMCF_AARCH32 in arch/arm64/include/asm/mmu.h
---
 arch/arm64/include/asm/mmu.h | 2 ++
 arch/arm64/mm/context.c | 25 +++++++++++++++++--------
 2 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 8d9fce0..49924e5 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -16,6 +16,8 @@ #ifndef __ASM_MMU_H #define __ASM_MMU_H
+#define USER_ASID_FLAG (UL(1) << 48) + typedef struct { atomic64_t id; void *vdso; diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index efcf1f7..f00f5ee 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -39,7 +39,16 @@ static cpumask_t tlb_flush_pending;
#define ASID_MASK (~GENMASK(asid_bits - 1, 0)) #define ASID_FIRST_VERSION (1UL << asid_bits) -#define NUM_USER_ASIDS ASID_FIRST_VERSION + +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +#define NUM_USER_ASIDS (ASID_FIRST_VERSION >> 1) +#define asid2idx(asid) (((asid) & ~ASID_MASK) >> 1) +#define idx2asid(idx) (((idx) << 1) & ~ASID_MASK) +#else +#define NUM_USER_ASIDS (ASID_FIRST_VERSION) +#define asid2idx(asid) ((asid) & ~ASID_MASK) +#define idx2asid(idx) asid2idx(idx) +#endif
/* Get the ASIDBits supported by the current CPU */ static u32 get_cpu_asid_bits(void) @@ -104,7 +113,7 @@ static void flush_context(unsigned int cpu) */ if (asid == 0) asid = per_cpu(reserved_asids, i); - __set_bit(asid & ~ASID_MASK, asid_map); + __set_bit(asid2idx(asid), asid_map); per_cpu(reserved_asids, i) = asid; }
@@ -159,16 +168,16 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu) * We had a valid ASID in a previous life, so try to re-use * it if possible. */ - asid &= ~ASID_MASK; - if (!__test_and_set_bit(asid, asid_map)) + if (!__test_and_set_bit(asid2idx(asid), asid_map)) return newasid; }
/* * Allocate a free ASID. If we can't find one, take a note of the - * currently active ASIDs and mark the TLBs as requiring flushes. - * We always count from ASID #1, as we use ASID #0 when setting a - * reserved TTBR0 for the init_mm. + * currently active ASIDs and mark the TLBs as requiring flushes. We + * always count from ASID #2 (index 1), as we use ASID #0 when setting + * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd + * pairs. */ asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx); if (asid != NUM_USER_ASIDS) @@ -185,7 +194,7 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu) set_asid: __set_bit(asid, asid_map); cur_idx = asid; - return asid | generation; + return idx2asid(asid) | generation; }
void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
From: Will Deacon will.deacon@arm.com
In order for code such as TLB invalidation to operate efficiently when the decision to map the kernel at EL0 is determined at runtime, this patch introduces a helper function, arm64_kernel_unmapped_at_el0, to determine whether or not the kernel is mapped whilst running in userspace.
Currently, this just reports the value of CONFIG_UNMAP_KERNEL_AT_EL0, but will later be hooked up to a fake CPU capability using a static key.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit fc0e1299da548b32440051f58f08e0c1eb7edd0b)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/include/asm/mmu.h | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 49924e5..279e75b 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,8 @@
#define USER_ASID_FLAG (UL(1) << 48)
+#ifndef __ASSEMBLY__ + typedef struct { atomic64_t id; void *vdso; @@ -30,6 +32,11 @@ typedef struct { */ #define ASID(mm) ((mm)->context.id.counter & 0xffff)
+static inline bool arm64_kernel_unmapped_at_el0(void) +{ + return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0); +} + extern void paging_init(void); extern void bootmem_init(void); extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt); @@ -39,4 +46,5 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, pgprot_t prot, bool allow_block_mappings); extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
+#endif /* !__ASSEMBLY__ */ #endif
From: Will Deacon will.deacon@arm.com
Since an mm has both a kernel and a user ASID, we need to ensure that broadcast TLB maintenance targets both address spaces so that things like CoW continue to work with the uaccess primitives in the kernel.
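As a rough illustration of why a single extra OR is enough (my own standalone sketch, not kernel code): in the TLBI argument the ASID sits in bits [63:48], so USER_ASID_FLAG (bit 48) is simply ASID bit 0, i.e. the kernel/user distinction introduced by the ASID-pair patch.

#include <stdio.h>

#define USER_ASID_FLAG	(1ULL << 48)

int main(void)
{
	unsigned long long asid = 10;			/* even kernel ASID     */
	unsigned long long addr = 0x400000ULL >> 12;	/* page number of uaddr */
	unsigned long long arg  = addr | (asid << 48);	/* kernel-ASID TLBI arg */
	unsigned long long user = arg | USER_ASID_FLAG;	/* user-ASID TLBI arg   */

	printf("kernel TLBI arg: %#llx (ASID %llu)\n", arg, arg >> 48);
	printf("user   TLBI arg: %#llx (ASID %llu)\n", user, user >> 48);
	return 0;
}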
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 9b0de864b5bc298ea53005ad812f3386f81aee9c)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/include/asm/tlbflush.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index deab523..ad6bd8b 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -23,6 +23,7 @@
#include <linux/sched.h> #include <asm/cputype.h> +#include <asm/mmu.h>
/* * Raw TLBI operations. @@ -42,6 +43,11 @@
#define __tlbi(op, ...) __TLBI_N(op, ##__VA_ARGS__, 1, 0)
+#define __tlbi_user(op, arg) do { \ + if (arm64_kernel_unmapped_at_el0()) \ + __tlbi(op, (arg) | USER_ASID_FLAG); \ +} while (0) + /* * TLB Management * ============== @@ -103,6 +109,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
dsb(ishst); __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); dsb(ish); }
@@ -113,6 +120,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
dsb(ishst); __tlbi(vale1is, addr); + __tlbi_user(vale1is, addr); dsb(ish); }
@@ -139,10 +147,13 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
dsb(ishst); for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) { - if (last_level) + if (last_level) { __tlbi(vale1is, addr); - else + __tlbi_user(vale1is, addr); + } else { __tlbi(vae1is, addr); + __tlbi_user(vae1is, addr); + } } dsb(ish); } @@ -182,6 +193,7 @@ static inline void __flush_tlb_pgtable(struct mm_struct *mm, unsigned long addr = uaddr >> 12 | (ASID(mm) << 48);
__tlbi(vae1is, addr); + __tlbi_user(vae1is, addr); dsb(ish); }
From: Mark Rutland mark.rutland@arm.com
In subsequent patches, we will detect stack overflow in our exception entry code, by verifying the SP after it has been decremented to make space for the exception regs.
This verification code is small, and we can minimize its impact by placing it directly in the vectors. To avoid redundant modification of the SP, we also need to move the initial decrement of the SP into the vectors.
As a preparatory step, this patch introduces kernel_ventry, which performs this decrement, and updates the entry code accordingly. Subsequent patches will fold SP verification into kernel_ventry.
There should be no functional change as a result of this patch.
Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org
[Mark: turn into prep patch, expand commit msg]
Signed-off-by: Mark Rutland mark.rutland@arm.com
Reviewed-by: Will Deacon will.deacon@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Cc: Catalin Marinas catalin.marinas@arm.com
Cc: James Morse james.morse@arm.com
(cherry picked from commit b11e5759bfac0c474d95ec4780b1566350e64cad)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/kernel/entry.S | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index b4c7db4..f5aa8f0 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -68,8 +68,13 @@ #define BAD_FIQ 2 #define BAD_ERROR 3
- .macro kernel_entry, el, regsize = 64 + .macro kernel_ventry label + .align 7 sub sp, sp, #S_FRAME_SIZE + b \label + .endm + + .macro kernel_entry, el, regsize = 64 .if \regsize == 32 mov w0, w0 // zero upper 32 bits of x0 .endif @@ -257,31 +262,31 @@ tsk .req x28 // current thread_info
.align 11 ENTRY(vectors) - ventry el1_sync_invalid // Synchronous EL1t - ventry el1_irq_invalid // IRQ EL1t - ventry el1_fiq_invalid // FIQ EL1t - ventry el1_error_invalid // Error EL1t + kernel_ventry el1_sync_invalid // Synchronous EL1t + kernel_ventry el1_irq_invalid // IRQ EL1t + kernel_ventry el1_fiq_invalid // FIQ EL1t + kernel_ventry el1_error_invalid // Error EL1t
- ventry el1_sync // Synchronous EL1h - ventry el1_irq // IRQ EL1h - ventry el1_fiq_invalid // FIQ EL1h - ventry el1_error_invalid // Error EL1h + kernel_ventry el1_sync // Synchronous EL1h + kernel_ventry el1_irq // IRQ EL1h + kernel_ventry el1_fiq_invalid // FIQ EL1h + kernel_ventry el1_error_invalid // Error EL1h
- ventry el0_sync // Synchronous 64-bit EL0 - ventry el0_irq // IRQ 64-bit EL0 - ventry el0_fiq_invalid // FIQ 64-bit EL0 - ventry el0_error_invalid // Error 64-bit EL0 + kernel_ventry el0_sync // Synchronous 64-bit EL0 + kernel_ventry el0_irq // IRQ 64-bit EL0 + kernel_ventry el0_fiq_invalid // FIQ 64-bit EL0 + kernel_ventry el0_error_invalid // Error 64-bit EL0
#ifdef CONFIG_COMPAT - ventry el0_sync_compat // Synchronous 32-bit EL0 - ventry el0_irq_compat // IRQ 32-bit EL0 - ventry el0_fiq_invalid_compat // FIQ 32-bit EL0 - ventry el0_error_invalid_compat // Error 32-bit EL0 + kernel_ventry el0_sync_compat // Synchronous 32-bit EL0 + kernel_ventry el0_irq_compat // IRQ 32-bit EL0 + kernel_ventry el0_fiq_invalid_compat // FIQ 32-bit EL0 + kernel_ventry el0_error_invalid_compat // Error 32-bit EL0 #else - ventry el0_sync_invalid // Synchronous 32-bit EL0 - ventry el0_irq_invalid // IRQ 32-bit EL0 - ventry el0_fiq_invalid // FIQ 32-bit EL0 - ventry el0_error_invalid // Error 32-bit EL0 + kernel_ventry el0_sync_invalid // Synchronous 32-bit EL0 + kernel_ventry el0_irq_invalid // IRQ 32-bit EL0 + kernel_ventry el0_fiq_invalid // FIQ 32-bit EL0 + kernel_ventry el0_error_invalid // Error 32-bit EL0 #endif END(vectors)
From: Xie XiuQi xiexiuqi@huawei.com
Today SError is taken using the inv_entry macro that ends up in bad_mode.
SError can be used by the RAS Extensions to notify either the OS or firmware of CPU problems, some of which may have been corrected.
To allow this handling to be added, add a do_serror() C function that just panic()s. Add the entry.S boiler plate to save/restore the CPU registers and unmask debug exceptions. Future patches may change do_serror() to return if the SError Interrupt was notification of a corrected error.
Signed-off-by: Xie XiuQi xiexiuqi@huawei.com
Signed-off-by: Wang Xiongfeng wangxiongfengi2@huawei.com
[Split out of a bigger patch, added compat path, renamed, enabled debug exceptions]
Signed-off-by: James Morse james.morse@arm.com
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit a92d4d1454ab8b43b80b89fa31fcedb8821f8164)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip vmap_stack in arch/arm64/kernel/traps.c and use old
	enable_dbg_and_irq instead of enable_daif in arch/arm64/kernel/entry.S
---
 arch/arm64/kernel/entry.S | 36 +++++++++++++++++++++++++++++-------
 arch/arm64/kernel/traps.c | 14 ++++++++++++++
 2 files changed, 43 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index f5aa8f0..60b202a 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -270,18 +270,18 @@ ENTRY(vectors) kernel_ventry el1_sync // Synchronous EL1h kernel_ventry el1_irq // IRQ EL1h kernel_ventry el1_fiq_invalid // FIQ EL1h - kernel_ventry el1_error_invalid // Error EL1h + kernel_ventry el1_error // Error EL1h
kernel_ventry el0_sync // Synchronous 64-bit EL0 kernel_ventry el0_irq // IRQ 64-bit EL0 kernel_ventry el0_fiq_invalid // FIQ 64-bit EL0 - kernel_ventry el0_error_invalid // Error 64-bit EL0 + kernel_ventry el0_error // Error 64-bit EL0
#ifdef CONFIG_COMPAT kernel_ventry el0_sync_compat // Synchronous 32-bit EL0 kernel_ventry el0_irq_compat // IRQ 32-bit EL0 kernel_ventry el0_fiq_invalid_compat // FIQ 32-bit EL0 - kernel_ventry el0_error_invalid_compat // Error 32-bit EL0 + kernel_ventry el0_error_compat // Error 32-bit EL0 #else kernel_ventry el0_sync_invalid // Synchronous 32-bit EL0 kernel_ventry el0_irq_invalid // IRQ 32-bit EL0 @@ -321,10 +321,6 @@ ENDPROC(el0_error_invalid) el0_fiq_invalid_compat: inv_entry 0, BAD_FIQ, 32 ENDPROC(el0_fiq_invalid_compat) - -el0_error_invalid_compat: - inv_entry 0, BAD_ERROR, 32 -ENDPROC(el0_error_invalid_compat) #endif
el1_sync_invalid: @@ -532,6 +528,10 @@ el0_svc_compat: el0_irq_compat: kernel_entry 0, 32 b el0_irq_naked + +el0_error_compat: + kernel_entry 0, 32 + b el0_error_naked #endif
el0_da: @@ -653,6 +653,28 @@ el0_irq_naked: b ret_to_user ENDPROC(el0_irq)
+el1_error: + kernel_entry 1 + mrs x1, esr_el1 + enable_dbg + mov x0, sp + bl do_serror + kernel_exit 1 +ENDPROC(el1_error) + +el0_error: + kernel_entry 0 +el0_error_naked: + mrs x1, esr_el1 + enable_dbg + mov x0, sp + bl do_serror + enable_dbg_and_irq + ct_user_exit + b ret_to_user +ENDPROC(el0_error) + + /* * Register switch for AArch64. The callee-saved registers need to be saved * and restored. On entry: diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index c743d1f..2ef7e33 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -637,6 +637,20 @@ asmlinkage void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr) force_sig_info(info.si_signo, &info, current); }
+ +asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr) +{ + nmi_enter(); + + console_verbose(); + + pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n", + smp_processor_id(), esr, esr_get_class_string(esr)); + __show_regs(regs); + + panic("Asynchronous SError Interrupt"); +} + void __pte_error(const char *file, int line, unsigned long val) { pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
From: AKASHI Takahiro takahiro.akashi@linaro.org
The current "rodata=off" parameter disables read-only kernel mappings under CONFIG_DEBUG_RODATA: commit d2aa1acad22f ("mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings")
This patch is a logical extension to module mappings ie. read-only mappings at module loading can be disabled even if CONFIG_DEBUG_SET_MODULE_RONX (mainly for debug use). Please note, however, that it only affects RO/RW permissions, keeping NX set.
This is the first step to make CONFIG_DEBUG_SET_MODULE_RONX mandatory (always-on) in the future as CONFIG_DEBUG_RODATA on x86 and arm64.
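A standalone sketch of the resulting behaviour (my own simplification for illustration; set_debug_rodata() here only stands in for the real strtobool()-based "rodata=" handler in init/main.c): with "rodata=off" on the command line, the module RO/RW helpers below simply return early.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool rodata_enabled = true;

/* Simplified stand-in for the kernel's "rodata=" __setup handler. */
static void set_debug_rodata(const char *str)
{
	if (!strcmp(str, "off"))
		rodata_enabled = false;
}

/* Mirrors the early-out added to module_enable_ro() and friends. */
static void module_enable_ro(void)
{
	if (!rodata_enabled) {
		printf("rodata=off: module text/rodata left writable\n");
		return;
	}
	printf("module text/rodata marked read-only\n");
}

int main(void)
{
	set_debug_rodata("off");	/* as if booted with rodata=off */
	module_enable_ro();
	return 0;
}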
Suggested-by: and Acked-by: Mark Rutland mark.rutland@arm.com
Signed-off-by: AKASHI Takahiro takahiro.akashi@linaro.org
Reviewed-by: Kees Cook keescook@chromium.org
Acked-by: Rusty Russell rusty@rustcorp.com.au
Link: http://lkml.kernel.org/r/20161114061505.15238-1-takahiro.akashi@linaro.org
Signed-off-by: Jessica Yu jeyu@redhat.com
(cherry picked from commit 39290b389ea2654f9190e3b48c57d27b24def83e)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 include/linux/init.h | 3 +++
 init/main.c | 7 +++++--
 kernel/module.c | 20 +++++++++++++++++---
 3 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/include/linux/init.h b/include/linux/init.h index e30104c..885c3e6 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -126,6 +126,9 @@ void prepare_namespace(void); void __init load_default_modules(void); int __init init_rootfs(void);
+#if defined(CONFIG_DEBUG_RODATA) || defined(CONFIG_DEBUG_SET_MODULE_RONX) +extern bool rodata_enabled; +#endif #ifdef CONFIG_DEBUG_RODATA void mark_rodata_ro(void); #endif diff --git a/init/main.c b/init/main.c index 99f0265..f22957a 100644 --- a/init/main.c +++ b/init/main.c @@ -81,6 +81,7 @@ #include <linux/proc_ns.h> #include <linux/io.h> #include <linux/kaiser.h> +#include <linux/cache.h>
#include <asm/io.h> #include <asm/bugs.h> @@ -914,14 +915,16 @@ static int try_to_run_init_process(const char *init_filename)
static noinline void __init kernel_init_freeable(void);
-#ifdef CONFIG_DEBUG_RODATA -static bool rodata_enabled = true; +#if defined(CONFIG_DEBUG_RODATA) || defined(CONFIG_SET_MODULE_RONX) +bool rodata_enabled __ro_after_init = true; static int __init set_debug_rodata(char *str) { return strtobool(str, &rodata_enabled); } __setup("rodata=", set_debug_rodata); +#endif
+#ifdef CONFIG_DEBUG_RODATA static void mark_readonly(void) { if (rodata_enabled) diff --git a/kernel/module.c b/kernel/module.c index 0e54d5b..938b84f 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1911,6 +1911,9 @@ static void frob_writable_data(const struct module_layout *layout, /* livepatching wants to disable read-only so it can frob module. */ void module_disable_ro(const struct module *mod) { + if (!rodata_enabled) + return; + frob_text(&mod->core_layout, set_memory_rw); frob_rodata(&mod->core_layout, set_memory_rw); frob_ro_after_init(&mod->core_layout, set_memory_rw); @@ -1920,6 +1923,9 @@ void module_disable_ro(const struct module *mod)
void module_enable_ro(const struct module *mod, bool after_init) { + if (!rodata_enabled) + return; + frob_text(&mod->core_layout, set_memory_ro); frob_rodata(&mod->core_layout, set_memory_ro); frob_text(&mod->init_layout, set_memory_ro); @@ -1952,6 +1958,9 @@ void set_all_modules_text_rw(void) { struct module *mod;
+ if (!rodata_enabled) + return; + mutex_lock(&module_mutex); list_for_each_entry_rcu(mod, &modules, list) { if (mod->state == MODULE_STATE_UNFORMED) @@ -1968,6 +1977,9 @@ void set_all_modules_text_ro(void) { struct module *mod;
+ if (!rodata_enabled) + return; + mutex_lock(&module_mutex); list_for_each_entry_rcu(mod, &modules, list) { if (mod->state == MODULE_STATE_UNFORMED) @@ -1981,10 +1993,12 @@ void set_all_modules_text_ro(void)
static void disable_ro_nx(const struct module_layout *layout) { - frob_text(layout, set_memory_rw); - frob_rodata(layout, set_memory_rw); + if (rodata_enabled) { + frob_text(layout, set_memory_rw); + frob_rodata(layout, set_memory_rw); + frob_ro_after_init(layout, set_memory_rw); + } frob_rodata(layout, set_memory_x); - frob_ro_after_init(layout, set_memory_rw); frob_ro_after_init(layout, set_memory_x); frob_writable_data(layout, set_memory_x); }
From: Will Deacon will.deacon@arm.com
To allow unmapping of the kernel whilst running at EL0, we need to point the exception vectors at an entry trampoline that can map/unmap the kernel on entry/exit respectively.
This patch adds the trampoline page, although it is not yet plugged into the vector table and is therefore unused.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit c7b9adaf85f818d747eeff5145eb4095ccd587fb)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	add asm/mmu.h in entry.S for the ASID macro
	add kernel-pgtable.h in entry.S for SWAPPER_DIR_SIZE and RESERVED_TTBR0_SIZE
	force add tramp_pg_dir in vmlinux.lds.S
---
 arch/arm64/include/asm/kernel-pgtable.h | 2 +
 arch/arm64/kernel/entry.S | 86 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S | 16 ++++++
 3 files changed, 104 insertions(+)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h index e4ddac9..135e829 100644 --- a/arch/arm64/include/asm/kernel-pgtable.h +++ b/arch/arm64/include/asm/kernel-pgtable.h @@ -54,6 +54,8 @@ #define SWAPPER_DIR_SIZE (SWAPPER_PGTABLE_LEVELS * PAGE_SIZE) #define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
+#define RESERVED_TTBR0_SIZE (0) /*no CONFIG_ARM64_SW_TTBR0_PAN introduced */ + /* Initial memory map size */ #if ARM64_SWAPPER_USES_SECTION_MAPS #define SWAPPER_BLOCK_SHIFT SECTION_SHIFT diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 60b202a..669f1c6 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -29,6 +29,8 @@ #include <asm/esr.h> #include <asm/irq.h> #include <asm/memory.h> +#include <asm/mmu.h> +#include <asm/kernel-pgtable.h> #include <asm/thread_info.h> #include <asm/asm-uaccess.h> #include <asm/unistd.h> @@ -828,6 +830,90 @@ __ni_sys_trace:
.popsection // .entry.text
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +/* + * Exception vectors trampoline. + */ + .pushsection ".entry.tramp.text", "ax" + + .macro tramp_map_kernel, tmp + mrs \tmp, ttbr1_el1 + sub \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE) + bic \tmp, \tmp, #USER_ASID_FLAG + msr ttbr1_el1, \tmp + .endm + + .macro tramp_unmap_kernel, tmp + mrs \tmp, ttbr1_el1 + add \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE) + orr \tmp, \tmp, #USER_ASID_FLAG + msr ttbr1_el1, \tmp + /* + * We avoid running the post_ttbr_update_workaround here because the + * user and kernel ASIDs don't have conflicting mappings, so any + * "blessing" as described in: + * + * http://lkml.kernel.org/r/56BB848A.6060603@caviumnetworks.com + * + * will not hurt correctness. Whilst this may partially defeat the + * point of using split ASIDs in the first place, it avoids + * the hit of invalidating the entire I-cache on every return to + * userspace. + */ + .endm + + .macro tramp_ventry, regsize = 64 + .align 7 +1: + .if \regsize == 64 + msr tpidrro_el0, x30 // Restored in kernel_ventry + .endif + tramp_map_kernel x30 + ldr x30, =vectors + prfm plil1strm, [x30, #(1b - tramp_vectors)] + msr vbar_el1, x30 + add x30, x30, #(1b - tramp_vectors) + isb + br x30 + .endm + + .macro tramp_exit, regsize = 64 + adr x30, tramp_vectors + msr vbar_el1, x30 + tramp_unmap_kernel x30 + .if \regsize == 64 + mrs x30, far_el1 + .endif + eret + .endm + + .align 11 +ENTRY(tramp_vectors) + .space 0x400 + + tramp_ventry + tramp_ventry + tramp_ventry + tramp_ventry + + tramp_ventry 32 + tramp_ventry 32 + tramp_ventry 32 + tramp_ventry 32 +END(tramp_vectors) + +ENTRY(tramp_exit_native) + tramp_exit +END(tramp_exit_native) + +ENTRY(tramp_exit_compat) + tramp_exit 32 +END(tramp_exit_compat) + + .ltorg + .popsection // .entry.tramp.text +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ + /* * Special system call wrappers. */ diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 1105aab..9b8c89d 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -56,6 +56,17 @@ jiffies = jiffies_64; #define HIBERNATE_TEXT #endif
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +#define TRAMP_TEXT \ + . = ALIGN(PAGE_SIZE); \ + VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \ + *(.entry.tramp.text) \ + . = ALIGN(PAGE_SIZE); \ + VMLINUX_SYMBOL(__entry_tramp_text_end) = .; +#else +#define TRAMP_TEXT +#endif + /* * The size of the PE/COFF section that covers the kernel image, which * runs from stext to _edata, must be a round multiple of the PE/COFF @@ -128,6 +139,7 @@ SECTIONS HYPERVISOR_TEXT IDMAP_TEXT HIBERNATE_TEXT + TRAMP_TEXT *(.fixup) *(.gnu.warning) . = ALIGN(16); @@ -216,6 +228,10 @@ SECTIONS swapper_pg_dir = .; . += SWAPPER_DIR_SIZE;
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + tramp_pg_dir = .; + . += PAGE_SIZE; +#endif _end = .;
STABS_DEBUG
From: Will Deacon will.deacon@arm.com
The exception entry trampoline needs to be mapped at the same virtual address in both the trampoline page table (which maps nothing else) and also the kernel page table, so that we can swizzle TTBR1_EL1 on exceptions from and return to EL0.
This patch maps the trampoline at a fixed virtual address in the fixmap area of the kernel virtual address space, which allows the kernel proper to be randomized with respect to the trampoline when KASLR is enabled.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 51a0048beb449682d632d0af52a515adb9f9882e)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/include/asm/fixmap.h | 5 +++++
 arch/arm64/include/asm/pgtable.h | 1 +
 arch/arm64/kernel/asm-offsets.c | 6 +++++-
 arch/arm64/mm/mmu.c | 24 ++++++++++++++++++++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h index caf86be..7b1d88c 100644 --- a/arch/arm64/include/asm/fixmap.h +++ b/arch/arm64/include/asm/fixmap.h @@ -51,6 +51,11 @@ enum fixed_addresses {
FIX_EARLYCON_MEM_BASE, FIX_TEXT_POKE0, + +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + FIX_ENTRY_TRAMP_TEXT, +#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT)) +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ __end_of_permanent_fixed_addresses,
/* diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 7acd3c5..3a30a39 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -692,6 +692,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; extern pgd_t idmap_pg_dir[PTRS_PER_PGD]; +extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
/* * Encode and decode a swap entry: diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c index c58ddf8..5f4bf3c 100644 --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c @@ -24,6 +24,7 @@ #include <linux/kvm_host.h> #include <linux/suspend.h> #include <asm/cpufeature.h> +#include <asm/fixmap.h> #include <asm/thread_info.h> #include <asm/memory.h> #include <asm/smp_plat.h> @@ -144,11 +145,14 @@ int main(void) DEFINE(ARM_SMCCC_RES_X2_OFFS, offsetof(struct arm_smccc_res, a2)); DEFINE(ARM_SMCCC_QUIRK_ID_OFFS, offsetof(struct arm_smccc_quirk, id)); DEFINE(ARM_SMCCC_QUIRK_STATE_OFFS, offsetof(struct arm_smccc_quirk, state)); - BLANK(); DEFINE(HIBERN_PBE_ORIG, offsetof(struct pbe, orig_address)); DEFINE(HIBERN_PBE_ADDR, offsetof(struct pbe, address)); DEFINE(HIBERN_PBE_NEXT, offsetof(struct pbe, next)); DEFINE(ARM64_FTR_SYSVAL, offsetof(struct arm64_ftr_reg, sys_val)); + BLANK(); +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + DEFINE(TRAMP_VALIAS, TRAMP_VALIAS); +#endif return 0; } diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 05615a3..f6211c7 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -419,6 +419,30 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end, vm_area_add_early(vma); }
+ +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +static int __init map_entry_trampoline(void) +{ + extern char __entry_tramp_text_start[]; + + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start); + + /* The trampoline is always mapped and can therefore be global */ + pgprot_val(prot) &= ~PTE_NG; + + /* Map only the text into the trampoline page table */ + memset(tramp_pg_dir, 0, PGD_SIZE); + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE, + prot, pgd_pgtable_alloc, 0); + + /* ...as well as the kernel page table */ + __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot); + return 0; +} +core_initcall(map_entry_trampoline); +#endif + /* * Create fine-grained mappings for the kernel. */
From: Will Deacon will.deacon@arm.com
We will need to treat exceptions from EL0 differently in kernel_ventry, so rework the macro to take the exception level as an argument and construct the branch target using that.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 5b1f7fe41909cde40decad9f0e8ee585777a0538)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip vmap_stack in arch/arm64/kernel/entry.S
---
 arch/arm64/kernel/entry.S | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 669f1c6..1176a73 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -70,10 +70,10 @@ #define BAD_FIQ 2 #define BAD_ERROR 3
- .macro kernel_ventry label + .macro kernel_ventry, el, label, regsize = 64 .align 7 sub sp, sp, #S_FRAME_SIZE - b \label + b el()\el()_\label .endm
.macro kernel_entry, el, regsize = 64 @@ -264,31 +264,31 @@ tsk .req x28 // current thread_info
.align 11 ENTRY(vectors) - kernel_ventry el1_sync_invalid // Synchronous EL1t - kernel_ventry el1_irq_invalid // IRQ EL1t - kernel_ventry el1_fiq_invalid // FIQ EL1t - kernel_ventry el1_error_invalid // Error EL1t + kernel_ventry 1, sync_invalid // Synchronous EL1t + kernel_ventry 1, irq_invalid // IRQ EL1t + kernel_ventry 1, fiq_invalid // FIQ EL1t + kernel_ventry 1, error_invalid // Error EL1t
- kernel_ventry el1_sync // Synchronous EL1h - kernel_ventry el1_irq // IRQ EL1h - kernel_ventry el1_fiq_invalid // FIQ EL1h - kernel_ventry el1_error // Error EL1h + kernel_ventry 1, sync // Synchronous EL1h + kernel_ventry 1, irq // IRQ EL1h + kernel_ventry 1, fiq_invalid // FIQ EL1h + kernel_ventry 1, error // Error EL1h
- kernel_ventry el0_sync // Synchronous 64-bit EL0 - kernel_ventry el0_irq // IRQ 64-bit EL0 - kernel_ventry el0_fiq_invalid // FIQ 64-bit EL0 - kernel_ventry el0_error // Error 64-bit EL0 + kernel_ventry 0, sync // Synchronous 64-bit EL0 + kernel_ventry 0, irq // IRQ 64-bit EL0 + kernel_ventry 0, fiq_invalid // FIQ 64-bit EL0 + kernel_ventry 0, error // Error 64-bit EL0
#ifdef CONFIG_COMPAT - kernel_ventry el0_sync_compat // Synchronous 32-bit EL0 - kernel_ventry el0_irq_compat // IRQ 32-bit EL0 - kernel_ventry el0_fiq_invalid_compat // FIQ 32-bit EL0 - kernel_ventry el0_error_compat // Error 32-bit EL0 + kernel_ventry 0, sync_compat, 32 // Synchronous 32-bit EL0 + kernel_ventry 0, irq_compat, 32 // IRQ 32-bit EL0 + kernel_ventry 0, fiq_invalid_compat, 32 // FIQ 32-bit EL0 + kernel_ventry 0, error_compat, 32 // Error 32-bit EL0 #else - kernel_ventry el0_sync_invalid // Synchronous 32-bit EL0 - kernel_ventry el0_irq_invalid // IRQ 32-bit EL0 - kernel_ventry el0_fiq_invalid // FIQ 32-bit EL0 - kernel_ventry el0_error_invalid // Error 32-bit EL0 + kernel_ventry 0, sync_invalid, 32 // Synchronous 32-bit EL0 + kernel_ventry 0, irq_invalid, 32 // IRQ 32-bit EL0 + kernel_ventry 0, fiq_invalid, 32 // FIQ 32-bit EL0 + kernel_ventry 0, error_invalid, 32 // Error 32-bit EL0 #endif END(vectors)
From: Will Deacon will.deacon@arm.com
Hook up the entry trampoline to our exception vectors so that all exceptions from and returns to EL0 go via the trampoline, which swizzles the vector base register accordingly. Transitioning to and from the kernel clobbers x30, so we use tpidrro_el0 and far_el1 as scratch registers for native tasks.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 4bf3286d29f3a88425d8d8cd53428cbb8f865f04)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/kernel/entry.S | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 1176a73..fbfee14 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -72,10 +72,26 @@
.macro kernel_ventry, el, label, regsize = 64 .align 7 +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + .if \el == 0 + .if \regsize == 64 + mrs x30, tpidrro_el0 + msr tpidrro_el0, xzr + .else + mov x30, xzr + .endif + .endif +#endif + sub sp, sp, #S_FRAME_SIZE b el()\el()_\label .endm
+ .macro tramp_alias, dst, sym + mov_q \dst, TRAMP_VALIAS + add \dst, \dst, #(\sym - .entry.tramp.text) + .endm + .macro kernel_entry, el, regsize = 64 .if \regsize == 32 mov w0, w0 // zero upper 32 bits of x0 @@ -157,18 +173,20 @@ ct_user_enter ldr x23, [sp, #S_SP] // load return stack pointer msr sp_el0, x23 + tst x22, #PSR_MODE32_BIT // native task? + b.eq 3f + #ifdef CONFIG_ARM64_ERRATUM_845719 alternative_if ARM64_WORKAROUND_845719 - tbz x22, #4, 1f #ifdef CONFIG_PID_IN_CONTEXTIDR mrs x29, contextidr_el1 msr contextidr_el1, x29 #else msr contextidr_el1, xzr #endif -1: alternative_else_nop_endif #endif +3: .endif msr elr_el1, x21 // set up the return data msr spsr_el1, x22 @@ -189,7 +207,22 @@ alternative_else_nop_endif ldp x28, x29, [sp, #16 * 14] ldr lr, [sp, #S_LR] add sp, sp, #S_FRAME_SIZE // restore sp - eret // return to kernel + +#ifndef CONFIG_UNMAP_KERNEL_AT_EL0 + eret +#else + .if \el == 0 + bne 4f + msr far_el1, x30 + tramp_alias x30, tramp_exit_native + br x30 +4: + tramp_alias x30, tramp_exit_compat + br x30 + .else + eret + .endif +#endif .endm
.macro get_thread_info, rd
From: Will Deacon will.deacon@arm.com
When unmapping the kernel at EL0, we use tpidrro_el0 as a scratch register during exception entry from native tasks and subsequently zero it in the kernel_ventry macro. We can therefore avoid zeroing tpidrro_el0 in the context-switch path for native tasks using the entry trampoline.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 18011eac28c7cb31c87b86b7d0e5b01894405c7f)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	keep tls_preserve_current_state() folded in arch/arm64/kernel/process.c
---
 arch/arm64/kernel/process.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c index 0e73949..0972ce5 100644 --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -306,17 +306,17 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
static void tls_thread_switch(struct task_struct *next) { - unsigned long tpidr, tpidrro; + unsigned long tpidr;
tpidr = read_sysreg(tpidr_el0); *task_user_tls(current) = tpidr;
- tpidr = *task_user_tls(next); - tpidrro = is_compat_thread(task_thread_info(next)) ? - next->thread.tp_value : 0; + if (is_compat_thread(task_thread_info(next))) + write_sysreg(next->thread.tp_value, tpidrro_el0); + else if (!arm64_kernel_unmapped_at_el0()) + write_sysreg(0, tpidrro_el0);
- write_sysreg(tpidr, tpidr_el0); - write_sysreg(tpidrro, tpidrro_el0); + write_sysreg(*task_user_tls(next), tpidr_el0); }
/* Restore the UAO state depending on next's addr_limit */
From: Will Deacon will.deacon@arm.com
Allow explicit disabling of the entry trampoline on the kernel command line (kpti=off) by adding a fake CPU feature (ARM64_UNMAP_KERNEL_AT_EL0) that can be used to toggle the alternative sequences in our entry code and avoid use of the trampoline altogether if desired. This also allows us to make use of a static key in arm64_kernel_unmapped_at_el0().
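In other words (a simplified sketch of unmap_kernel_at_el0() from the diff below, with kpti_forced standing in for __kpti_forced; not the verbatim kernel code), the boot-time decision boils down to:

#include <stdbool.h>

/*
 * kpti_forced mirrors __kpti_forced:
 *   0 = "kpti=" not given, >0 = kpti=1/on, <0 = kpti=0/off.
 */
bool kernel_unmapped_at_el0(int kpti_forced, bool kaslr_enabled)
{
	if (kpti_forced)		/* command line always wins */
		return kpti_forced > 0;

	/* Useful for KASLR robustness; later patches in the series
	 * (e.g. the ID_AA64PFR0_EL1.CSV3 one) extend this default. */
	return kaslr_enabled;
}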
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit ea1e3de85e94d711f63437c04624aa0e8de5c8b3)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip many missing cpu features in arch/arm64/include/asm/cpucaps.h
	and arch/arm64/kernel/cpufeature.c, and use the old cpus_have_cap
	instead of cpus_have_const_cap in arch/arm64/include/asm/mmu.h
---
 arch/arm64/include/asm/cpucaps.h | 4 ++--
 arch/arm64/include/asm/mmu.h | 3 ++-
 arch/arm64/kernel/cpufeature.c | 41 ++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/entry.S | 9 +++++----
 4 files changed, 50 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 87b4465..ec601db2 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -34,7 +34,7 @@ #define ARM64_HAS_32BIT_EL0 13 #define ARM64_HYP_OFFSET_LOW 14 #define ARM64_MISMATCHED_CACHE_LINE_SIZE 15 - -#define ARM64_NCAPS 16 +#define ARM64_UNMAP_KERNEL_AT_EL0 16 +#define ARM64_NCAPS 17
#endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 279e75b..a813edf 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -34,7 +34,8 @@ typedef struct {
static inline bool arm64_kernel_unmapped_at_el0(void) { - return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0); + return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) && + cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0); }
extern void paging_init(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 3a129d4..74b168c 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -746,6 +746,40 @@ static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry, return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode(); }
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */ + +static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry, + int __unused) +{ + /* Forced on command line? */ + if (__kpti_forced) { + pr_info_once("kernel page table isolation forced %s by command line option\n", + __kpti_forced > 0 ? "ON" : "OFF"); + return __kpti_forced > 0; + } + + /* Useful for KASLR robustness */ + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) + return true; + + return false; +} + +static int __init parse_kpti(char *str) +{ + bool enabled; + int ret = strtobool(str, &enabled); + + if (ret) + return ret; + + __kpti_forced = enabled ? 1 : -1; + return 0; +} +__setup("kpti=", parse_kpti); +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ + static const struct arm64_cpu_capabilities arm64_features[] = { { .desc = "GIC system register CPU interface", @@ -829,6 +863,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .def_scope = SCOPE_SYSTEM, .matches = hyp_offset_low, }, +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + { + .capability = ARM64_UNMAP_KERNEL_AT_EL0, + .def_scope = SCOPE_SYSTEM, + .matches = unmap_kernel_at_el0, + }, +#endif {}, };
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index fbfee14..603faa1 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -73,6 +73,7 @@ .macro kernel_ventry, el, label, regsize = 64 .align 7 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +alternative_if ARM64_UNMAP_KERNEL_AT_EL0 .if \el == 0 .if \regsize == 64 mrs x30, tpidrro_el0 @@ -81,6 +82,7 @@ mov x30, xzr .endif .endif +alternative_else_nop_endif #endif
sub sp, sp, #S_FRAME_SIZE @@ -208,10 +210,9 @@ alternative_else_nop_endif ldr lr, [sp, #S_LR] add sp, sp, #S_FRAME_SIZE // restore sp
-#ifndef CONFIG_UNMAP_KERNEL_AT_EL0 - eret -#else .if \el == 0 +alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0 +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 bne 4f msr far_el1, x30 tramp_alias x30, tramp_exit_native @@ -219,10 +220,10 @@ alternative_else_nop_endif 4: tramp_alias x30, tramp_exit_compat br x30 +#endif .else eret .endif -#endif .endm
.macro get_thread_info, rd
From: Will Deacon will.deacon@arm.com
Add a Kconfig entry to control use of the entry trampoline, which allows us to unmap the kernel whilst running in userspace and improve the robustness of KASLR.
Reviewed-by: Mark Rutland mark.rutland@arm.com
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 084eb77cd3a81134d02500977dc0ecc9277dc97d)
Signed-off-by: Alex Shi alex.shi@linaro.org
---
 arch/arm64/Kconfig | 13 +++++++++++++
 1 file changed, 13 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index cf57a77..f660eaa 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -733,6 +733,19 @@ config FORCE_MAX_ZONEORDER However for 4K, we choose a higher default value, 11 as opposed to 10, giving us 4M allocations matching the default size used by generic code.
+config UNMAP_KERNEL_AT_EL0 + bool "Unmap kernel when running in userspace (aka "KAISER")" + default y + help + Some attacks against KASLR make use of the timing difference between + a permission fault which could arise from a page table entry that is + present in the TLB, and a translation fault which always requires a + page table walk. This option defends against these attacks by unmapping + the kernel whilst running in userspace, therefore forcing translation + faults for all of kernel space. + + If unsure, say Y. + menuconfig ARMV8_DEPRECATED bool "Emulate deprecated/obsolete ARMv8 instructions" depends on COMPAT
From: Will Deacon will.deacon@arm.com
The literal pool entry for identifying the vectors base is the only piece of information in the trampoline page that identifies the true location of the kernel.
This patch moves it into a page-aligned region of the .rodata section and maps this adjacent to the trampoline text via an additional fixmap entry, which protects against any accidental leakage of the trampoline contents.
Suggested-by: Ard Biesheuvel ard.biesheuvel@linaro.org
Tested-by: Laura Abbott labbott@redhat.com
Tested-by: Shanker Donthineni shankerd@codeaurora.org
Signed-off-by: Will Deacon will.deacon@arm.com
(cherry picked from commit 6c27c4082f4f70b9f41df4d0adf51128b40351df)
Signed-off-by: Alex Shi alex.shi@linaro.org

Conflicts:
	skip ARM64_WORKAROUND_QCOM_FALKOR_E1003 in entry.S
---
 arch/arm64/include/asm/fixmap.h | 1 +
 arch/arm64/kernel/entry.S | 13 +++++++++++++
 arch/arm64/kernel/vmlinux.lds.S | 5 ++++-
 arch/arm64/mm/mmu.c | 10 +++++++++-
 4 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h index 7b1d88c..d8e5805 100644 --- a/arch/arm64/include/asm/fixmap.h +++ b/arch/arm64/include/asm/fixmap.h @@ -53,6 +53,7 @@ enum fixed_addresses { FIX_TEXT_POKE0,
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + FIX_ENTRY_TRAMP_DATA, FIX_ENTRY_TRAMP_TEXT, #define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT)) #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */ diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 603faa1..69e6d1a 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -903,7 +903,12 @@ __ni_sys_trace: msr tpidrro_el0, x30 // Restored in kernel_ventry .endif tramp_map_kernel x30 +#ifdef CONFIG_RANDOMIZE_BASE + adr x30, tramp_vectors + PAGE_SIZE + ldr x30, [x30] +#else ldr x30, =vectors +#endif prfm plil1strm, [x30, #(1b - tramp_vectors)] msr vbar_el1, x30 add x30, x30, #(1b - tramp_vectors) @@ -946,6 +951,14 @@ END(tramp_exit_compat)
.ltorg .popsection // .entry.tramp.text +#ifdef CONFIG_RANDOMIZE_BASE + .pushsection ".rodata", "a" + .align PAGE_SHIFT + .globl __entry_tramp_data_start +__entry_tramp_data_start: + .quad vectors + .popsection // .rodata +#endif /* CONFIG_RANDOMIZE_BASE */ #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/* diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 9b8c89d..e5678fd 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -251,7 +251,10 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K, ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1)) <= SZ_4K, "Hibernate exit text too big or misaligned") #endif - +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 +ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE, + "Entry trampoline text too big") +#endif /* * If padding is applied before .head.text, virt<->phys conversions will fail. */ diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index f6211c7..0a462e9 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -436,8 +436,16 @@ static int __init map_entry_trampoline(void) __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE, prot, pgd_pgtable_alloc, 0);
- /* ...as well as the kernel page table */ + /* Map both the text and data into the kernel page table */ __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot); + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { + extern char __entry_tramp_data_start[]; + + __set_fixmap(FIX_ENTRY_TRAMP_DATA, + __pa_symbol(__entry_tramp_data_start), + PAGE_KERNEL_RO); + } + return 0; } core_initcall(map_entry_trampoline);
From: Will Deacon will.deacon@arm.com
In order to invoke the CPU capability ->matches callback from the ->enable callback for applying local-CPU workarounds, we need a handle on the capability structure.
This patch passes a pointer to the capability structure to the ->enable callback.
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit e3af87ceae9e5bcda2e8cdbccbe517a442994bba) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: arch/arm64/kernel/cpufeature.c --- arch/arm64/kernel/cpufeature.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 74b168c..2bf7e03 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -990,7 +990,7 @@ void __init enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps) * uses an IPI, giving us a PSTATE that disappears when * we return. */ - stop_machine(caps->enable, NULL, cpu_online_mask); + stop_machine(caps->enable, (void *)caps, cpu_online_mask); }
/* @@ -1046,7 +1046,7 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps) cpu_die_early(); } if (caps->enable) - caps->enable(NULL); + caps->enable((void *)caps); } }
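For illustration only (not part of the patch), a minimal sketch of an ->enable callback that makes use of the capability pointer it now receives; the function names here are hypothetical, but the pattern matches enable_psci_bp_hardening() later in this series:

	static int enable_example_workaround(void *data)
	{
		const struct arm64_cpu_capabilities *entry = data;

		/* Re-run the ->matches() check on the local CPU before acting. */
		if (entry->matches(entry, SCOPE_LOCAL_CPU))
			apply_example_workaround();	/* hypothetical helper */

		return 0;
	}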
From: Marc Zyngier marc.zyngier@arm.com
this_cpu_has_cap() only checks the feature array, and not the errata one. In order to be able to check for a CPU-local erratum, allow it to inspect the latter as well.
This is consistent with cpus_have_cap()'s behaviour, which includes errata already.
Acked-by: Thomas Gleixner tglx@linutronix.de Acked-by: Daniel Lezcano daniel.lezcano@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com (cherry picked from commit 8f4137588261d7504f4aa022dc9d1a1fd1940e8e) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/kernel/cpufeature.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 2bf7e03..86f4af2 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1097,20 +1097,29 @@ static void __init setup_feature_capabilities(void) * Check if the current CPU has a given feature capability. * Should be called from non-preemptible context. */ -bool this_cpu_has_cap(unsigned int cap) +static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array, + unsigned int cap) { const struct arm64_cpu_capabilities *caps;
if (WARN_ON(preemptible())) return false;
- for (caps = arm64_features; caps->desc; caps++) + for (caps = cap_array; caps->desc; caps++) if (caps->capability == cap && caps->matches) return caps->matches(caps, SCOPE_LOCAL_CPU);
return false; }
+extern const struct arm64_cpu_capabilities arm64_errata[]; + +bool this_cpu_has_cap(unsigned int cap) +{ + return (__this_cpu_has_cap(arm64_features, cap) || + __this_cpu_has_cap(arm64_errata, cap)); +} + void __init setup_cpu_features(void) { u32 cwg;
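As an illustrative usage sketch (not from the patch), a caller can now query an erratum capability on the local CPU the same way it would query a feature; ARM64_WORKAROUND_CAVIUM_27456 is just an example of an erratum capability that already exists in this tree:

	/* this_cpu_has_cap() must be called from non-preemptible context. */
	preempt_disable();
	if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_27456))
		pr_info("CPU%d: Cavium erratum 27456 workaround applies\n",
			smp_processor_id());
	preempt_enable();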
From: Suzuki K Poulose suzuki.poulose@arm.com
Sometimes a single capability could be listed multiple times with differing matches(), e.g, CPU errata for different MIDR versions. This breaks verify_local_cpu_feature() and this_cpu_has_cap() as we stop checking for a capability on a CPU with the first entry in the given table, which is not sufficient. Make sure we run the checks for all entries of the same capability. We do this by fixing __this_cpu_has_cap() to run through all the entries in the given table for a match and reuse it for verify_local_cpu_feature().
Cc: Mark Rutland mark.rutland@arm.com Cc: Will Deacon will.deacon@arm.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit 8b5ea183eca527229785e39883359f162f43f58c) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: arch/arm64/kernel/cpufeature.c --- arch/arm64/kernel/cpufeature.c | 44 ++++++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 86f4af2..0b9dff5 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -963,6 +963,26 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps) cap_set_elf_hwcap(hwcaps); }
+/* + * Check if the current CPU has a given feature capability. + * Should be called from non-preemptible context. + */ +static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array, + unsigned int cap) +{ + const struct arm64_cpu_capabilities *caps; + + if (WARN_ON(preemptible())) + return false; + + for (caps = cap_array; caps->desc; caps++) + if (caps->capability == cap && + caps->matches && + caps->matches(caps, SCOPE_LOCAL_CPU)) + return true; + return false; +} + void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps, const char *info) { @@ -1031,8 +1051,9 @@ verify_local_elf_hwcaps(const struct arm64_cpu_capabilities *caps) }
static void -verify_local_cpu_features(const struct arm64_cpu_capabilities *caps) +verify_local_cpu_features(const struct arm64_cpu_capabilities *caps_list) { + const struct arm64_cpu_capabilities *caps = caps_list; for (; caps->matches; caps++) { if (!cpus_have_cap(caps->capability)) continue; @@ -1040,7 +1061,7 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps) * If the new CPU misses an advertised feature, we cannot proceed * further, park the cpu. */ - if (!caps->matches(caps, SCOPE_LOCAL_CPU)) { + if (!__this_cpu_has_cap(caps_list, caps->capability)) { pr_crit("CPU%d: missing feature: %s\n", smp_processor_id(), caps->desc); cpu_die_early(); @@ -1093,25 +1114,6 @@ static void __init setup_feature_capabilities(void) enable_cpu_capabilities(arm64_features); }
-/* - * Check if the current CPU has a given feature capability. - * Should be called from non-preemptible context. - */ -static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array, - unsigned int cap) -{ - const struct arm64_cpu_capabilities *caps; - - if (WARN_ON(preemptible())) - return false; - - for (caps = cap_array; caps->desc; caps++) - if (caps->capability == cap && caps->matches) - return caps->matches(caps, SCOPE_LOCAL_CPU); - - return false; -} - extern const struct arm64_cpu_capabilities arm64_errata[];
bool this_cpu_has_cap(unsigned int cap)
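For reference, this is the pattern the fix is meant to support: the same capability listed more than once with different matches, as the errata table later in this series does for branch predictor hardening (a sketch; MIDR_ALL_VERSIONS and enable_psci_bp_hardening come from later patches in this backport):

	static const struct arm64_cpu_capabilities example_errata[] = {
		{
			.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
			MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
			.enable = enable_psci_bp_hardening,
		},
		{
			.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
			MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
			.enable = enable_psci_bp_hardening,
		},
		{ }
	};

With __this_cpu_has_cap() scanning the whole table, a Cortex-A72 still matches even though the A57 entry for the same capability comes first.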
From: Will Deacon will.deacon@arm.com
Speculation attacks against the entry trampoline can potentially resteer the speculative instruction stream through the indirect branch and into arbitrary gadgets within the kernel.
This patch defends against these attacks by forcing a misprediction through the return stack: a dummy BL instruction loads an entry into the stack, so that the predicted program flow of the subsequent RET instruction is to a branch-to-self instruction which is finally resolved as a branch to the kernel vectors with speculation suppressed.
Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 67ff2abfda2fdf2c2cc8f430e48e9c0df8996093) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/kernel/entry.S | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 69e6d1a..074c402 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -902,6 +902,14 @@ __ni_sys_trace: .if \regsize == 64 msr tpidrro_el0, x30 // Restored in kernel_ventry .endif + /* + * Defend against branch aliasing attacks by pushing a dummy + * entry onto the return stack and using a RET instruction to + * enter the full-fat kernel vectors. + */ + bl 2f + b . +2: tramp_map_kernel x30 #ifdef CONFIG_RANDOMIZE_BASE adr x30, tramp_vectors + PAGE_SIZE @@ -913,7 +921,7 @@ __ni_sys_trace: msr vbar_el1, x30 add x30, x30, #(1b - tramp_vectors) isb - br x30 + ret .endm
.macro tramp_exit, regsize = 64
From: Will Deacon will.deacon@arm.com
Although CONFIG_UNMAP_KERNEL_AT_EL0 does make KASLR more robust, it's actually more useful as a mitigation against speculation attacks that can leak arbitrary kernel data to userspace through speculation.
Reword the Kconfig help message to reflect this, and make the option depend on EXPERT so that it is on by default for the majority of users.
Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 82bd1be3b09d0ce67b2b91a8b1db0695129707eb) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/Kconfig | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f660eaa..d7a516c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -734,15 +734,14 @@ config FORCE_MAX_ZONEORDER 4M allocations matching the default size used by generic code.
config UNMAP_KERNEL_AT_EL0 - bool "Unmap kernel when running in userspace (aka "KAISER")" + bool "Unmap kernel when running in userspace (aka "KAISER")" if EXPERT default y help - Some attacks against KASLR make use of the timing difference between - a permission fault which could arise from a page table entry that is - present in the TLB, and a translation fault which always requires a - page table walk. This option defends against these attacks by unmapping - the kernel whilst running in userspace, therefore forcing translation - faults for all of kernel space. + Speculation attacks against some high-performance processors can + be used to bypass MMU permission checks and leak kernel data to + userspace. This can be defended against by unmapping the kernel + when running in userspace, mapping it back in on exception entry + via a trampoline page in the vector table.
If unsure, say Y.
From: Will Deacon will.deacon@arm.com
For non-KASLR kernels where the KPTI behaviour has not been overridden on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether or not we should unmap the kernel whilst running at EL0.
Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 5fdc0c7508cdb2eea11b441134f5caf6ee9cf7c9) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: skip CPU features like SVE etc., use the 5-parameter ARM64_FTR_BITS(), and replace read_sanitised_ftr_reg with the old name read_system_reg arch/arm64/include/asm/sysreg.h arch/arm64/kernel/cpufeature.c --- arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/kernel/cpufeature.c | 8 +++++++- 2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 7393cc7..7cb7f7c 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -117,6 +117,7 @@ #define ID_AA64ISAR0_AES_SHIFT 4
/* id_aa64pfr0 */ +#define ID_AA64PFR0_CSV3_SHIFT 60 #define ID_AA64PFR0_GIC_SHIFT 24 #define ID_AA64PFR0_ASIMD_SHIFT 20 #define ID_AA64PFR0_FP_SHIFT 16 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 0b9dff5..a77fb5b 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -98,6 +98,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0), S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI), S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI), + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), /* Linux doesn't care about the EL3 */ ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, ID_AA64PFR0_EL3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL2_SHIFT, 4, 0), @@ -752,6 +753,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry, int __unused) { + u64 pfr0 = read_system_reg(SYS_ID_AA64PFR0_EL1); + /* Forced on command line? */ if (__kpti_forced) { pr_info_once("kernel page table isolation forced %s by command line option\n", @@ -763,7 +766,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry, if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) return true;
- return false; + /* Defer to CPU feature registers */ + return !cpuid_feature_extract_unsigned_field(pfr0, + ID_AA64PFR0_CSV3_SHIFT); }
static int __init parse_kpti(char *str) @@ -865,6 +870,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = { }, #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 { + .desc = "Kernel page table isolation (KPTI)", .capability = ARM64_UNMAP_KERNEL_AT_EL0, .def_scope = SCOPE_SYSTEM, .matches = unmap_kernel_at_el0,
From: Will Deacon will.deacon@arm.com
Entry into recent versions of ARM Trusted Firmware will invalidate the CPU branch predictor state in order to protect against aliasing attacks.
This patch exposes the PSCI "VERSION" function via psci_ops, so that it can be invoked outside of the PSCI driver where necessary.
Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit a6df699da786079b7e2249df3f3211be7862f0e1) Signed-off-by: Alex Shi alex.shi@linaro.org --- drivers/firmware/psci.c | 2 ++ include/linux/psci.h | 1 + 2 files changed, 3 insertions(+)
diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c index 8263429..9a3ce76 100644 --- a/drivers/firmware/psci.c +++ b/drivers/firmware/psci.c @@ -496,6 +496,8 @@ static void __init psci_init_migrate(void) static void __init psci_0_2_set_functions(void) { pr_info("Using standard PSCI v0.2 function IDs\n"); + psci_ops.get_version = psci_get_version; + psci_function_id[PSCI_FN_CPU_SUSPEND] = PSCI_FN_NATIVE(0_2, CPU_SUSPEND); psci_ops.cpu_suspend = psci_cpu_suspend; diff --git a/include/linux/psci.h b/include/linux/psci.h index bdea1cb..6306ab1 100644 --- a/include/linux/psci.h +++ b/include/linux/psci.h @@ -26,6 +26,7 @@ int psci_cpu_init_idle(unsigned int cpu); int psci_cpu_suspend_enter(unsigned long index);
struct psci_operations { + u32 (*get_version)(void); int (*cpu_suspend)(u32 state, unsigned long entry_point); int (*cpu_off)(u32 state); int (*cpu_on)(unsigned long cpuid, unsigned long entry_point);
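A minimal usage sketch (not part of the patch) of the newly exposed hook; enable_psci_bp_hardening() later in this series relies on the same psci_ops.get_version check:

	if (psci_ops.get_version) {
		u32 ver = psci_ops.get_version();

		pr_info("PSCI version %d.%d\n",
			PSCI_VERSION_MAJOR(ver), PSCI_VERSION_MINOR(ver));
	}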
From: Laura Abbott labbott@redhat.com
Certain architectures may have the kernel image mapped separately to alias the linear map. Introduce a macro lm_alias to translate a kernel image symbol into its linear alias. This is used in part with work to add CONFIG_DEBUG_VIRTUAL support for arm64.
Reviewed-by: Mark Rutland mark.rutland@arm.com Tested-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Laura Abbott labbott@redhat.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 568c5fe5a54f2654f5a4c599c45b8a62ed9a2013) Signed-off-by: Alex Shi alex.shi@linaro.org --- include/linux/mm.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 2217e2f..edd2480 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -76,6 +76,10 @@ extern int mmap_rnd_compat_bits __read_mostly; #define page_to_virt(x) __va(PFN_PHYS(page_to_pfn(x))) #endif
+#ifndef lm_alias +#define lm_alias(x) __va(__pa_symbol(x)) +#endif + /* * To prevent common memory management code establishing * a zero page mapping on a read fault.
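To make the intended use concrete, here is a condensed sketch of what __copy_hyp_vect_bpi() in the branch predictor hardening patch later in this series does: writes to a kernel-image symbol go through its linear-map alias rather than its execution mapping (the function name is made up for this sketch; __bp_harden_hyp_vecs_start comes from the later patch):

	static void copy_to_kernel_text(const char *src, size_t len)
	{
		void *dst = lm_alias(__bp_harden_hyp_vecs_start);

		memcpy(dst, src, len);
		flush_icache_range((uintptr_t)dst, (uintptr_t)dst + len);
	}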
We will soon need to invoke a CPU-specific function pointer after changing page tables, so move post_ttbr_update_workaround out into C code to make this possible.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 81659da6deabd66da571f82ece19539aa76e370c) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: don't include PAN related changes arch/arm64/include/asm/assembler.h arch/arm64/kernel/entry.S arch/arm64/mm/proc.S --- arch/arm64/mm/context.c | 9 +++++++++ arch/arm64/mm/proc.S | 7 +------ 2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index f00f5ee..b9b0875 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -233,6 +233,15 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu) cpu_switch_mm(mm->pgd, mm); }
+/* Errata workaround post TTBRx_EL1 update. */ +asmlinkage void post_ttbr_update_workaround(void) +{ + asm(ALTERNATIVE("nop; nop; nop", + "ic iallu; dsb nsh; isb", + ARM64_WORKAROUND_CAVIUM_27456, + CONFIG_CAVIUM_ERRATUM_27456)); +} + static int asids_init(void) { asid_bits = get_cpu_asid_bits(); diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S index 3378f3e..9a2d800 100644 --- a/arch/arm64/mm/proc.S +++ b/arch/arm64/mm/proc.S @@ -139,12 +139,7 @@ ENTRY(cpu_do_switch_mm) isb msr ttbr0_el1, x0 // now update TTBR0 isb -alternative_if ARM64_WORKAROUND_CAVIUM_27456 - ic iallu - dsb nsh - isb -alternative_else_nop_endif - ret + b post_ttbr_update_workaround // Back to C code... ENDPROC(cpu_do_switch_mm)
.pushsection ".idmap.text", "ax"
Aliasing attacks against CPU branch predictors can allow an attacker to redirect speculative control flow on some CPUs and potentially divulge information from one context to another.
This patch adds initial skeleton code behind a new Kconfig option to enable implementation-specific mitigations against these attacks for CPUs that are affected.
Co-developed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 7bd293b6845d003ab087faa6515a626c8703b8da) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: expand enable_da_f in entry.S use 5 parameters ARM64_FTR_BITS() add percpu.h in mm_types.h for percpu functions use cpus_have_cap instead of cpus_have_const_cap arch/arm64/include/asm/cpucaps.h arch/arm64/include/asm/sysreg.h arch/arm64/kernel/cpufeature.c arch/arm64/kernel/entry.S arch/arm64/mm/context.c arch/arm64/mm/fault.c --- arch/arm64/Kconfig | 17 +++++++++ arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/include/asm/mmu.h | 37 ++++++++++++++++++++ arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/kernel/Makefile | 4 +++ arch/arm64/kernel/bpi.S | 55 +++++++++++++++++++++++++++++ arch/arm64/kernel/cpu_errata.c | 74 ++++++++++++++++++++++++++++++++++++++++ arch/arm64/kernel/cpufeature.c | 1 + arch/arm64/kernel/entry.S | 7 ++-- arch/arm64/mm/context.c | 2 ++ arch/arm64/mm/fault.c | 16 +++++++++ include/linux/mm_types.h | 1 + 12 files changed, 215 insertions(+), 3 deletions(-) create mode 100644 arch/arm64/kernel/bpi.S
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index d7a516c..7c2d5ea 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -745,6 +745,23 @@ config UNMAP_KERNEL_AT_EL0
If unsure, say Y.
+config HARDEN_BRANCH_PREDICTOR + bool "Harden the branch predictor against aliasing attacks" if EXPERT + default y + help + Speculation attacks against some high-performance processors rely on + being able to manipulate the branch predictor for a victim context by + executing aliasing branches in the attacker context. Such attacks + can be partially mitigated against by clearing internal branch + predictor state and limiting the prediction logic in some situations. + + This config option will take CPU-specific actions to harden the + branch predictor against aliasing attacks and may rely on specific + instruction sequences or control bits being set by the system + firmware. + + If unsure, say Y. + menuconfig ARMV8_DEPRECATED bool "Emulate deprecated/obsolete ARMv8 instructions" depends on COMPAT diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index ec601db2..bb99b7f 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -35,6 +35,7 @@ #define ARM64_HYP_OFFSET_LOW 14 #define ARM64_MISMATCHED_CACHE_LINE_SIZE 15 #define ARM64_UNMAP_KERNEL_AT_EL0 16 -#define ARM64_NCAPS 17 +#define ARM64_HARDEN_BRANCH_PREDICTOR 17 +#define ARM64_NCAPS 18
#endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index a813edf..2a2ce3c 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -38,6 +38,43 @@ static inline bool arm64_kernel_unmapped_at_el0(void) cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0); }
+typedef void (*bp_hardening_cb_t)(void); + +struct bp_hardening_data { + int hyp_vectors_slot; + bp_hardening_cb_t fn; +}; + +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[]; + +DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); + +static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void) +{ + return this_cpu_ptr(&bp_hardening_data); +} + +static inline void arm64_apply_bp_hardening(void) +{ + struct bp_hardening_data *d; + + if (!cpus_have_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) + return; + + d = arm64_get_bp_hardening_data(); + if (d->fn) + d->fn(); +} +#else +static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void) +{ + return NULL; +} + +static inline void arm64_apply_bp_hardening(void) { } +#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ + extern void paging_init(void); extern void bootmem_init(void); extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt); diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 7cb7f7c..a692209 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -118,6 +118,7 @@
/* id_aa64pfr0 */ #define ID_AA64PFR0_CSV3_SHIFT 60 +#define ID_AA64PFR0_CSV2_SHIFT 56 #define ID_AA64PFR0_GIC_SHIFT 24 #define ID_AA64PFR0_ASIMD_SHIFT 20 #define ID_AA64PFR0_FP_SHIFT 16 diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 7d66bba..74b8fd8 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -51,6 +51,10 @@ arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \ cpu-reset.o
+ifeq ($(CONFIG_KVM),y) +arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR) += bpi.o +endif + obj-y += $(arm64-obj-y) vdso/ probes/ obj-m += $(arm64-obj-m) head-y := head.o diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S new file mode 100644 index 0000000..06a931e --- /dev/null +++ b/arch/arm64/kernel/bpi.S @@ -0,0 +1,55 @@ +/* + * Contains CPU specific branch predictor invalidation sequences + * + * Copyright (C) 2018 ARM Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <linux/linkage.h> + +.macro ventry target + .rept 31 + nop + .endr + b \target +.endm + +.macro vectors target + ventry \target + 0x000 + ventry \target + 0x080 + ventry \target + 0x100 + ventry \target + 0x180 + + ventry \target + 0x200 + ventry \target + 0x280 + ventry \target + 0x300 + ventry \target + 0x380 + + ventry \target + 0x400 + ventry \target + 0x480 + ventry \target + 0x500 + ventry \target + 0x580 + + ventry \target + 0x600 + ventry \target + 0x680 + ventry \target + 0x700 + ventry \target + 0x780 +.endm + + .align 11 +ENTRY(__bp_harden_hyp_vecs_start) + .rept 4 + vectors __kvm_hyp_vector + .endr +ENTRY(__bp_harden_hyp_vecs_end) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index b75e917..87b0163 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -46,6 +46,80 @@ static int cpu_enable_trap_ctr_access(void *__unused) return 0; }
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu_context.h> +#include <asm/cacheflush.h> + +DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); + +#ifdef CONFIG_KVM +static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K); + int i; + + for (i = 0; i < SZ_2K; i += 0x80) + memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start); + + flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K); +} + +static void __install_bp_hardening_cb(bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + static int last_slot = -1; + static DEFINE_SPINLOCK(bp_lock); + int cpu, slot = -1; + + spin_lock(&bp_lock); + for_each_possible_cpu(cpu) { + if (per_cpu(bp_hardening_data.fn, cpu) == fn) { + slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu); + break; + } + } + + if (slot == -1) { + last_slot++; + BUG_ON(((__bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start) + / SZ_2K) <= last_slot); + slot = last_slot; + __copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end); + } + + __this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot); + __this_cpu_write(bp_hardening_data.fn, fn); + spin_unlock(&bp_lock); +} +#else +static void __install_bp_hardening_cb(bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + __this_cpu_write(bp_hardening_data.fn, fn); +} +#endif /* CONFIG_KVM */ + +static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry, + bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + u64 pfr0; + + if (!entry->matches(entry, SCOPE_LOCAL_CPU)) + return; + + pfr0 = read_cpuid(ID_AA64PFR0_EL1); + if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT)) + return; + + __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end); +} +#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ + #define MIDR_RANGE(model, min, max) \ .def_scope = SCOPE_LOCAL_CPU, \ .matches = is_affected_midr_range, \ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index a77fb5b..fbbf65f 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -99,6 +99,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI), S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI), ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0), /* Linux doesn't care about the EL3 */ ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, ID_AA64PFR0_EL3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL2_SHIFT, 4, 0), diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 074c402..bd39713 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -589,12 +589,15 @@ el0_ia: */ mrs x26, far_el1 // enable interrupts before calling the main handler - enable_dbg_and_irq + msr daifclr, #(8 | 4 | 1) +#ifdef CONFIG_TRACE_IRQFLAGS + bl trace_hardirqs_off +#endif ct_user_exit mov x0, x26 mov x1, x25 mov x2, sp - bl do_mem_abort + bl do_el0_ia_bp_hardening b ret_to_user el0_fpsimd_acc: /* diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index b9b0875..accf7ea 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -240,6 +240,8 @@ asmlinkage void 
post_ttbr_update_workaround(void) "ic iallu; dsb nsh; isb", ARM64_WORKAROUND_CAVIUM_27456, CONFIG_CAVIUM_ERRATUM_27456)); + + arm64_apply_bp_hardening(); }
static int asids_init(void) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 403fe9e..8e7e54d 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -590,6 +590,22 @@ asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr, arm64_notify_die("", regs, &info, esr); }
+asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr, + unsigned int esr, + struct pt_regs *regs) +{ + /* + * We've taken an instruction abort from userspace and not yet + * re-enabled IRQs. If the address is a kernel address, apply + * BP hardening prior to enabling IRQs and pre-emption. + */ + if (addr > TASK_SIZE) + arm64_apply_bp_hardening(); + + local_irq_enable(); + do_mem_abort(addr, esr, regs); +} + /* * Handle stack alignment exceptions. */ diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index e8471c2..15a82f3 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -13,6 +13,7 @@ #include <linux/uprobes.h> #include <linux/page-flags-layout.h> #include <linux/workqueue.h> +#include <linux/percpu.h> #include <asm/page.h> #include <asm/mmu.h>
CC Marc Zyngier marc.zyngier@arm.com
On 01/31/2018 12:05 AM, Alex Shi wrote:
Aliasing attacks against CPU branch predictors can allow an attacker to redirect speculative control flow on some CPUs and potentially divulge information from one context to another.
This patch adds initial skeleton code behind a new Kconfig option to enable implementation-specific mitigations against these attacks for CPUs that are affected.
Co-developed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 7bd293b6845d003ab087faa6515a626c8703b8da) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: expand enable_da_f in entry.S use 5 parameters ARM64_FTR_BITS() add percpu.h in mm_types.h for percpu functions use cpus_have_cap instead of cpus_have_const_cap
<snip>...
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu_context.h> +#include <asm/cacheflush.h>
+DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+#ifdef CONFIG_KVM +static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
+{
- void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K)
Hi Will,
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
+static inline void *kvm_get_hyp_vector(void) +{ + struct bp_hardening_data *data = arm64_get_bp_hardening_data(); + void *vect = kvm_ksym_ref(__kvm_hyp_vector); + + if (data->fn) { + vect = __bp_harden_hyp_vecs_start + + data->hyp_vectors_slot * SZ_2K; + + if (!cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN)) + vect = lm_alias(vect);
Thanks Alex
- int i;
- for (i = 0; i < SZ_2K; i += 0x80)
memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
- flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+}
<snip>...
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
After reading through the lm_alias history, I found v4.9 does not have this support yet. So I removed the lm_alias call, and the QEMU machine boots well.
The updated commit is here: https://git.linaro.org/kernel/linux-linaro-stable.git/log/?h=lts-v4.9-kpti 9a07085
Now I am working on another issue on Juno: a 23s CPU soft lockup with KVM support:
0.734497] Initramfs unpacking failed: junk in compressed archive [ 0.740906] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available [ 0.749082] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available [ 0.757280] kvm [1]: 8-bit VMID [ 0.760394] kvm [1]: IDMAP page: 808a7000 [ 0.764363] kvm [1]: HYP VA range: 800000000000:ffffffffffff [ 0.770554] kvm [1]: Hyp mode initialized successfully [ 21.777786] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 21.783392] 1-...: (7 GPs behind) idle=3c1/1/0 softirq=42/42 fqs=2626 [ 21.789934] (detected by 2, t=5255 jiffies, g=-271, c=-272, q=1) [ 21.795967] Task dump for CPU 1: [ 21.799156] swapper/1 R running task 0 0 1 0x00000002 [ 21.806139] Call trace: [ 21.808563] [<ffff000008085600>] __switch_to+0x88/0xb0 [ 21.813644] [<0000000000000007>] 0x7 [ 28.241786] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/0:1] [ 28.249016] Modules linked in: [ 28.252035] [ 28.253507] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.9.78-00037-g9a07085 #76 [ 28.260737] Hardware name: ARM Juno development board (r2) (DT) [ 28.266592] task: ffff800976ca8000 task.stack: ffff800976cb0000 [ 28.272455] PC is at smp_call_function_many+0x26c/0x2d0 [ 28.277624] LR is at smp_call_function_many+0x22c/0x2d0 [ 28.282791] pc : [<ffff00000812bcdc>] lr : [<ffff00000812bc9c>] pstate: 80000045 [ 28.290108] sp : ffff800976cb3c40 [ 28.293382] x29: ffff800976cb3c40 x28: ffff000008d67000 [ 28.298642] x27: ffff000008d67384 x26: 0000000000000040 [ 28.303903] x25: 0000000000000000 x24: ffff0000080a6670 [ 28.309162] x23: 0000000000000001 x22: ffff000008d2a880 [ 28.314422] x21: ffff80097ff88980 x20: ffff80097ff88988 [ 28.319681] x19: ffff000008d67530 x18: 0000000000000010 [ 28.324941] x17: 0000000000000000 x16: 0000000100000000 [ 28.330200] x15: ffff000088e43c27 x14: 0000000000000006 [ 28.335459] x13: ffff000008e43c35 x12: 0000000000000009
Thanks Alex
On 31/01/18 08:18, Alex Shi wrote:
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
After reading through the lm_alias history, I found v4.9 does not have this support yet. So I removed the lm_alias call, and the QEMU machine boots well.
The updated commit is here: https://git.linaro.org/kernel/linux-linaro-stable.git/log/?h=lts-v4.9-kpti 9a07085
Now I am working on another issue on Juno: a 23s CPU soft lockup with KVM support:
0.734497] Initramfs unpacking failed: junk in compressed archive [ 0.740906] hw perfevents: enabled with armv8_cortex_a72 PMU driver, 7 counters available [ 0.749082] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available [ 0.757280] kvm [1]: 8-bit VMID [ 0.760394] kvm [1]: IDMAP page: 808a7000 [ 0.764363] kvm [1]: HYP VA range: 800000000000:ffffffffffff [ 0.770554] kvm [1]: Hyp mode initialized successfully [ 21.777786] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 21.783392] 1-...: (7 GPs behind) idle=3c1/1/0 softirq=42/42 fqs=2626 [ 21.789934] (detected by 2, t=5255 jiffies, g=-271, c=-272, q=1) [ 21.795967] Task dump for CPU 1: [ 21.799156] swapper/1 R running task 0 0 1 0x00000002 [ 21.806139] Call trace: [ 21.808563] [<ffff000008085600>] __switch_to+0x88/0xb0 [ 21.813644] [<0000000000000007>] 0x7 [ 28.241786] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/0:1] [ 28.249016] Modules linked in: [ 28.252035] [ 28.253507] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.9.78-00037-g9a07085 #76 [ 28.260737] Hardware name: ARM Juno development board (r2) (DT) [ 28.266592] task: ffff800976ca8000 task.stack: ffff800976cb0000 [ 28.272455] PC is at smp_call_function_many+0x26c/0x2d0 [ 28.277624] LR is at smp_call_function_many+0x22c/0x2d0 [ 28.282791] pc : [<ffff00000812bcdc>] lr : [<ffff00000812bc9c>] pstate: 80000045 [ 28.290108] sp : ffff800976cb3c40 [ 28.293382] x29: ffff800976cb3c40 x28: ffff000008d67000 [ 28.298642] x27: ffff000008d67384 x26: 0000000000000040 [ 28.303903] x25: 0000000000000000 x24: ffff0000080a6670 [ 28.309162] x23: 0000000000000001 x22: ffff000008d2a880 [ 28.314422] x21: ffff80097ff88980 x20: ffff80097ff88988 [ 28.319681] x19: ffff000008d67530 x18: 0000000000000010 [ 28.324941] x17: 0000000000000000 x16: 0000000100000000 [ 28.330200] x15: ffff000088e43c27 x14: 0000000000000006 [ 28.335459] x13: ffff000008e43c35 x12: 0000000000000009
<educated-guess> You've entered HYP with a bogus address (likely bogus vectors), and you're stuck on a recursive exception. </educated-guess>
M.
Alex,
On 31/01/18 05:57, Alex Shi wrote:
CC Marc Zyngier marc.zyngier@arm.com
On 01/31/2018 12:05 AM, Alex Shi wrote:
Aliasing attacks against CPU branch predictors can allow an attacker to redirect speculative control flow on some CPUs and potentially divulge information from one context to another.
This patch adds initial skeleton code behind a new Kconfig option to enable implementation-specific mitigations against these attacks for CPUs that are affected.
Co-developed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 7bd293b6845d003ab087faa6515a626c8703b8da) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: expand enable_da_f in entry.S use 5 parameters ARM64_FTR_BITS() add percpu.h in mm_types.h for percpu functions use cpus_have_cap instead of cpus_have_const_cap
<snip>...
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu_context.h> +#include <asm/cacheflush.h>
+DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+#ifdef CONFIG_KVM +static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
+{
- void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K)
Hi Will,
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
The question you have to ask yourself is whether or not the address you're getting for __bp_harden_hyp_vecs_start is from the linear mapping or from another range. If the former, lm_alias is not necessary (but you may want to find out why it gives you something that is unexpected). If the latter, then you really need to translate it to the linear map, as you're not going to be able to write to kernel text via its execution mapping.
As for kvm_get_hyp_vector, same thing. We only map the linear map at EL2, so you really need to pick the right set of VAs, or kern_hyp_va is going to point you to lalaland (and that will be pretty final).
I cannot page 4.9 in at the moment, but some limited investigation should help you find out how we used to map the kernel last year.
Thanks,
M.
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
+{
- void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K)
Hi Will,
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
The question you have to ask yourself is whether or not the address you're getting for __bp_harden_hyp_vecs_start is from the linear mapping or from another range. If the former, lm_alias is not necessary (but you may want to find out why it gives you something that is unexpected). If the latter, then you really need to translate it to the linear map, as you're not going to be able to write to kernel text via its execution mapping.
Hi Marc,
Thanks a lot for help! :)
It seems I am still stuck and confused by this address alias issue. Is there some shared vector that needs to be accessed from both the host (HYP) and KVM (the normal kernel)? Or does HYP need to copy some vectors to (raw address - kimg) for itself? And if not in HYP, does the kernel only use the raw address?
I am still confused about the lm_alias usage, because the v4.15 kernel running at EL1 works with the lm_alias address 0xffff80..., but the v4.9 kernel only works with the raw 'dst' address 0xffff00... at EL1. At the same time, Juno r2, which runs at EL2, reports a fault on the raw address 0xffff00...
As for kvm_get_hyp_vector, same thing. We only map the linear map at EL2, so you really need to pick the right set of VAs, or kern_hyp_va is going to point you to lalaland (and that will be pretty final).
So HYP runs at EL2 and uses the lm_alias'd address?
I cannot page 4.9 in at the moment, but some limited investigation should help you find out how we used to map the kernel last year.
Thanks,
M.
On 31/01/18 12:39, Alex Shi wrote:
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
+{
- void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K)
Hi Will,
Here lm_alias translates the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging fault:
Unable to handle kernel paging request at virtual address ffff8000000....
And without the lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there some place where pages for the aliased address should be requested?
The question you have to ask yourself is whether or not the address you're getting for __bp_harden_hyp_vecs_start is from the linear mapping or from another range. If the former, lm_alias is not necessary (but you may want to find out why it gives you something that is unexpected). If the latter, then you really need to translate it to the linear map, as you're not going to be able to write to kernel text via its execution mapping.
Hi Marc,
Thanks a lot for help! :)
It seems I am still stuck and confused by this address alias issue. Is there some shared vector that needs to be accessed from both the host (HYP) and KVM (the normal kernel)? Or does HYP need to copy some vectors to (raw address - kimg) for itself? And if not in HYP, does the kernel only use the raw address?
I really don't understand your questions, so let me explain how things work:
- The kernel embeds all of the KVM text. Some of that text is meant to be mapped at EL2.
- All the mappings at HYP are at an offset from the linear mapping, and you can convert a linear mapping VA to a HYP VA using kern_hyp_va().
- If you have a kernel address, you have to convert it to a linear map address first before feeding it to the hypervisor.
- Nothing is copied from kernel to HYP. Things are just mapped differently. The only copy is part of this patch, and generates the code in place. It doesn't affect how things are mapped.
- We only map a tiny bit of the linear map in HYP (HYP text, vcpu and kvm structures).
I am still confused about the lm_alias usage, because the v4.15 kernel running at EL1 works with the lm_alias address 0xffff80..., but the v4.9 kernel only works with the raw 'dst' address 0xffff00... at EL1. At the same time, Juno r2, which runs at EL2, reports a fault on the raw address 0xffff00...
What do you mean by "null address" or "raw address"?
As for kvm_get_hyp_vector, same thing. We only map the linear map at EL2, so you really need to pick the right set of VAs, or kern_hyp_va is going to point you to lalaland (and that will be pretty final).
So HYP runs at EL2 and uses the lm_alias'd address?
See above. If you have a kernel address and want to convert it to a HYP address, you must turn it into a linear map address first, then turn it into a HYP VA.
M.
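Expressed as a sketch (the variable names are illustrative; the symbols come from the hardening patches discussed in this thread), the conversion chain Marc describes is:

	void *kva  = kvm_ksym_ref(__bp_harden_hyp_vecs_start);	/* kernel image VA */
	void *lmva = lm_alias(kva);				/* linear map VA */
	void *hyp  = kern_hyp_va(lmva);				/* VA as seen at EL2 */

The last step is only meaningful once the page has actually been mapped into HYP, e.g. via create_hyp_mappings().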
I really don't understand your questions, so let me explain how things work:
Sorry for my ignorance about virtualization, and many thanks for the patient explanation!
From the doc Documentation/virtual/kvm/arm/hyp-abi.txt, I guess the correct concept is that KVM is a hypervisor.
- The kernel embeds all of the KVM text. Some of that text is meant to
be mapped at EL2.
- All the mappings at HYP are at an offset from the linear mapping, and
you can convert a linear mapping VA to a HYP VA using kern_hyp_va().
Why do we need this mapping? And who creates this mapping, and when? Are both addresses accessed from the same EL?
It looks like a base kernel difference between 4.9 and 4.15-rc3 (https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti) causes this boot panic.
My guess is that 4.15-rc3 sets up the linear mapping for __bp_harden_hyp_vecs_start while 4.9 doesn't, so in this patch __bp_harden_hyp_vecs_start needs to be accessed via a different address. I assume that because I could not figure out the mapping change during my backporting.
The backported patch for 3 commits is here: arm64: Implement branch predictor hardening for affected Cortex-A CPUs arm64: KVM: Use per-CPU vector when BP hardening is enabled arm64: Add skeleton to harden the branch predictor against aliasing attacks
What I changed compared to the original version: move changes from virt/kvm/arm/arm.c to arch/arm/kvm/arm.c; expand enable_da_f in entry.S; use the 5-parameter ARM64_FTR_BITS(); add percpu.h in mm_types.h for percpu functions; use cpus_have_cap instead of cpus_have_const_cap; and remove lm_alias in the bp vector copy
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index a58bbaa..d10e362 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -223,6 +223,16 @@ static inline unsigned int kvm_get_vmid_bits(void) return 8; }
+static inline void *kvm_get_hyp_vector(void) +{ + return kvm_ksym_ref(__kvm_hyp_vector); +} + +static inline int kvm_map_vectors(void) +{ + return 0; +} + #endif /* !__ASSEMBLY__ */
#endif /* __ARM_KVM_MMU_H__ */ diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 19b5f5c..3539fc4 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -1088,7 +1088,7 @@ static void cpu_init_hyp_mode(void *dummy) pgd_ptr = kvm_mmu_get_httbr(); stack_page = __this_cpu_read(kvm_arm_hyp_stack_page); hyp_stack_ptr = stack_page + PAGE_SIZE; - vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector); + vector_ptr = (unsigned long)kvm_get_hyp_vector();
__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr); __cpu_init_stage2(); @@ -1344,6 +1344,12 @@ static int init_hyp_mode(void) goto out_err; }
+ err = kvm_map_vectors(); + if (err) { + kvm_err("Cannot map vectors\n"); + goto out_err; + } + /* * Map the Hyp stack pages */ diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index d7a516c..7c2d5ea 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -745,6 +745,23 @@ config UNMAP_KERNEL_AT_EL0
If unsure, say Y.
+config HARDEN_BRANCH_PREDICTOR + bool "Harden the branch predictor against aliasing attacks" if EXPERT + default y + help + Speculation attacks against some high-performance processors rely on + being able to manipulate the branch predictor for a victim context by + executing aliasing branches in the attacker context. Such attacks + can be partially mitigated against by clearing internal branch + predictor state and limiting the prediction logic in some situations. + + This config option will take CPU-specific actions to harden the + branch predictor against aliasing attacks and may rely on specific + instruction sequences or control bits being set by the system + firmware. + + If unsure, say Y. + menuconfig ARMV8_DEPRECATED bool "Emulate deprecated/obsolete ARMv8 instructions" depends on COMPAT diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index ec601db2..bb99b7f 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -35,6 +35,7 @@ #define ARM64_HYP_OFFSET_LOW 14 #define ARM64_MISMATCHED_CACHE_LINE_SIZE 15 #define ARM64_UNMAP_KERNEL_AT_EL0 16 -#define ARM64_NCAPS 17 +#define ARM64_HARDEN_BRANCH_PREDICTOR 17 +#define ARM64_NCAPS 18
#endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 6d22017..80bf337 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -313,5 +313,43 @@ static inline unsigned int kvm_get_vmid_bits(void) return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8; }
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu.h> + +static inline void *kvm_get_hyp_vector(void) +{ + struct bp_hardening_data *data = arm64_get_bp_hardening_data(); + void *vect = kvm_ksym_ref(__kvm_hyp_vector); + + if (data->fn) { + vect = __bp_harden_hyp_vecs_start + + data->hyp_vectors_slot * SZ_2K; + + if (!cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN)) + vect = lm_alias(vect); + } + + return vect; +} + +static inline int kvm_map_vectors(void) +{ + return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start), + kvm_ksym_ref(__bp_harden_hyp_vecs_end), + PAGE_HYP_EXEC); +} + +#else +static inline void *kvm_get_hyp_vector(void) +{ + return kvm_ksym_ref(__kvm_hyp_vector); +} + +static inline int kvm_map_vectors(void) +{ + return 0; +} +#endif + #endif /* __ASSEMBLY__ */ #endif /* __ARM64_KVM_MMU_H__ */ diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index a813edf..2a2ce3c 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -38,6 +38,43 @@ static inline bool arm64_kernel_unmapped_at_el0(void) cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0); }
+typedef void (*bp_hardening_cb_t)(void); + +struct bp_hardening_data { + int hyp_vectors_slot; + bp_hardening_cb_t fn; +}; + +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[]; + +DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); + +static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void) +{ + return this_cpu_ptr(&bp_hardening_data); +} + +static inline void arm64_apply_bp_hardening(void) +{ + struct bp_hardening_data *d; + + if (!cpus_have_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) + return; + + d = arm64_get_bp_hardening_data(); + if (d->fn) + d->fn(); +} +#else +static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void) +{ + return NULL; +} + +static inline void arm64_apply_bp_hardening(void) { } +#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ + extern void paging_init(void); extern void bootmem_init(void); extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt); diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 7cb7f7c..a692209 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -118,6 +118,7 @@
/* id_aa64pfr0 */ #define ID_AA64PFR0_CSV3_SHIFT 60 +#define ID_AA64PFR0_CSV2_SHIFT 56 #define ID_AA64PFR0_GIC_SHIFT 24 #define ID_AA64PFR0_ASIMD_SHIFT 20 #define ID_AA64PFR0_FP_SHIFT 16 diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index 7d66bba..74b8fd8 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -51,6 +51,10 @@ arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \ cpu-reset.o
+ifeq ($(CONFIG_KVM),y) +arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR) += bpi.o +endif + obj-y += $(arm64-obj-y) vdso/ probes/ obj-m += $(arm64-obj-m) head-y := head.o diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S new file mode 100644 index 0000000..dec95bd --- /dev/null +++ b/arch/arm64/kernel/bpi.S @@ -0,0 +1,79 @@ +/* + * Contains CPU specific branch predictor invalidation sequences + * + * Copyright (C) 2018 ARM Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <linux/linkage.h> + +.macro ventry target + .rept 31 + nop + .endr + b \target +.endm + +.macro vectors target + ventry \target + 0x000 + ventry \target + 0x080 + ventry \target + 0x100 + ventry \target + 0x180 + + ventry \target + 0x200 + ventry \target + 0x280 + ventry \target + 0x300 + ventry \target + 0x380 + + ventry \target + 0x400 + ventry \target + 0x480 + ventry \target + 0x500 + ventry \target + 0x580 + + ventry \target + 0x600 + ventry \target + 0x680 + ventry \target + 0x700 + ventry \target + 0x780 +.endm + + .align 11 +ENTRY(__bp_harden_hyp_vecs_start) + .rept 4 + vectors __kvm_hyp_vector + .endr +ENTRY(__bp_harden_hyp_vecs_end) +ENTRY(__psci_hyp_bp_inval_start) + sub sp, sp, #(8 * 18) + stp x16, x17, [sp, #(16 * 0)] + stp x14, x15, [sp, #(16 * 1)] + stp x12, x13, [sp, #(16 * 2)] + stp x10, x11, [sp, #(16 * 3)] + stp x8, x9, [sp, #(16 * 4)] + stp x6, x7, [sp, #(16 * 5)] + stp x4, x5, [sp, #(16 * 6)] + stp x2, x3, [sp, #(16 * 7)] + stp x0, x1, [sp, #(16 * 8)] + mov x0, #0x84000000 + smc #0 + ldp x16, x17, [sp, #(16 * 0)] + ldp x14, x15, [sp, #(16 * 1)] + ldp x12, x13, [sp, #(16 * 2)] + ldp x10, x11, [sp, #(16 * 3)] + ldp x8, x9, [sp, #(16 * 4)] + ldp x6, x7, [sp, #(16 * 5)] + ldp x4, x5, [sp, #(16 * 6)] + ldp x2, x3, [sp, #(16 * 7)] + ldp x0, x1, [sp, #(16 * 8)] + add sp, sp, #(8 * 18) +ENTRY(__psci_hyp_bp_inval_end) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index c66a673c..cded268 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -46,6 +46,100 @@ static int cpu_enable_trap_ctr_access(void *__unused) return 0; }
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu_context.h> +#include <asm/cacheflush.h> + +DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); + +#ifdef CONFIG_KVM +extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[]; + +static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + void *dst = __bp_harden_hyp_vecs_start + slot * SZ_2K; + int i; + + for (i = 0; i < SZ_2K; i += 0x80) + memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start); + + flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K); +} + +static void __install_bp_hardening_cb(bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + static int last_slot = -1; + static DEFINE_SPINLOCK(bp_lock); + int cpu, slot = -1; + + spin_lock(&bp_lock); + for_each_possible_cpu(cpu) { + if (per_cpu(bp_hardening_data.fn, cpu) == fn) { + slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu); + break; + } + } + + if (slot == -1) { + last_slot++; + BUG_ON(((__bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start) + / SZ_2K) <= last_slot); + slot = last_slot; + __copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end); + } + + __this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot); + __this_cpu_write(bp_hardening_data.fn, fn); + spin_unlock(&bp_lock); +} +#else +#define __psci_hyp_bp_inval_start NULL +#define __psci_hyp_bp_inval_end NULL + +static void __install_bp_hardening_cb(bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + __this_cpu_write(bp_hardening_data.fn, fn); +} +#endif /* CONFIG_KVM */ + +static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry, + bp_hardening_cb_t fn, + const char *hyp_vecs_start, + const char *hyp_vecs_end) +{ + u64 pfr0; + + if (!entry->matches(entry, SCOPE_LOCAL_CPU)) + return; + + pfr0 = read_cpuid(ID_AA64PFR0_EL1); + if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT)) + return; + + __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end); +} + +#include <linux/psci.h> + +static int enable_psci_bp_hardening(void *data) +{ + const struct arm64_cpu_capabilities *entry = data; + + if (psci_ops.get_version) + install_bp_hardening_cb(entry, + (bp_hardening_cb_t)psci_ops.get_version, + __psci_hyp_bp_inval_start, + __psci_hyp_bp_inval_end); + + return 0; +} +#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ + #define MIDR_RANGE(model, min, max) \ .def_scope = SCOPE_LOCAL_CPU, \ .matches = is_affected_midr_range, \ @@ -137,6 +231,28 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .def_scope = SCOPE_LOCAL_CPU, .enable = cpu_enable_trap_ctr_access, }, +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A57), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A72), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A73), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), + .enable = enable_psci_bp_hardening, + }, +#endif { } }; diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index a77fb5b..fbbf65f 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -99,6 +99,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = { S_ARM64_FTR_BITS(FTR_STRICT, 
FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI), S_ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI), ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0), /* Linux doesn't care about the EL3 */ ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, ID_AA64PFR0_EL3_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, ID_AA64PFR0_EL2_SHIFT, 4, 0), diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 074c402..bd39713 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -589,12 +589,15 @@ el0_ia: */ mrs x26, far_el1 // enable interrupts before calling the main handler - enable_dbg_and_irq + msr daifclr, #(8 | 4 | 1) +#ifdef CONFIG_TRACE_IRQFLAGS + bl trace_hardirqs_off +#endif ct_user_exit mov x0, x26 mov x1, x25 mov x2, sp - bl do_mem_abort + bl do_el0_ia_bp_hardening b ret_to_user el0_fpsimd_acc: /* diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 9b082cc..3eab6ac 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -51,7 +51,7 @@ static void __hyp_text __activate_traps_vhe(void) val &= ~CPACR_EL1_FPEN; write_sysreg(val, cpacr_el1);
- write_sysreg(__kvm_hyp_vector, vbar_el1); + write_sysreg(kvm_get_hyp_vector(), vbar_el1); }
static void __hyp_text __activate_traps_nvhe(void) diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index b9b0875..accf7ea 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -240,6 +240,8 @@ asmlinkage void post_ttbr_update_workaround(void) "ic iallu; dsb nsh; isb", ARM64_WORKAROUND_CAVIUM_27456, CONFIG_CAVIUM_ERRATUM_27456)); + + arm64_apply_bp_hardening(); }
static int asids_init(void) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 403fe9e..8e7e54d 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -590,6 +590,22 @@ asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr, arm64_notify_die("", regs, &info, esr); }
+asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr, + unsigned int esr, + struct pt_regs *regs) +{ + /* + * We've taken an instruction abort from userspace and not yet + * re-enabled IRQs. If the address is a kernel address, apply + * BP hardening prior to enabling IRQs and pre-emption. + */ + if (addr > TASK_SIZE) + arm64_apply_bp_hardening(); + + local_irq_enable(); + do_mem_abort(addr, esr, regs); +} + /* * Handle stack alignment exceptions. */ diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index e8471c2..15a82f3 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -13,6 +13,7 @@ #include <linux/uprobes.h> #include <linux/page-flags-layout.h> #include <linux/workqueue.h> +#include <linux/percpu.h> #include <asm/page.h> #include <asm/mmu.h>
On 01/02/18 09:31, Alex Shi wrote:
I really don't understand your questions, so let me explain how things work:
Sorry for my ignorance about virt machines, and many thanks for the patient explanation!
From the doc Documentation/virtual/kvm/arm/hyp-abi.txt, I gather the correct concept is that KVM is a hypervisor.
- The kernel embeds all of the KVM text. Some of that text is meant to
be mapped at EL2.
- All the mappings at HYP are at an offset from the linear mapping, and
you can convert a linear mapping VA to a HYP VA using kern_hyp_va().
Why do we need this mapping? Who creates it, and when? Are both addresses accessed from the same EL?
We need this mapping because EL2 cannot use the same VAs as EL1. EL2 only has a single TTBR, and thus cannot use negative addressing. The page tables are created by EL1, and only EL2 accesses memory via this mapping.
That's how KVM/arm64 has worked since the beginning of time, and not much has changed since then.
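As a rough illustration of the kern_hyp_va() point above (hedged: the real macro and mask differ between kernel versions, and the VA_BITS value below is just an assumed configuration, not taken from the series), the conversion boils down to masking a TTBR1 linear-map VA down into the single positive VA range that EL2 has:

/* Illustrative only -- not the kernel's actual definition. */
#define EXAMPLE_VA_BITS		48UL
#define EXAMPLE_HYP_VA_MASK	((1UL << (EXAMPLE_VA_BITS - 1)) - 1)

static inline unsigned long example_kern_hyp_va(unsigned long linear_va)
{
	/*
	 * EL1 linear-map addresses are "negative" (top bits set, TTBR1);
	 * EL2 only has TTBR0, so the same offset is reused as a low,
	 * positive VA by masking off the sign-extension bits.
	 */
	return linear_va & EXAMPLE_HYP_VA_MASK;
}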
It looks like a base kernel difference between 4.9 and 4.15-rc3 (https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti) causes this boot panic.
I guess 4.15-rc3 sets up the linear mapping for __bp_harden_hyp_vecs_start while 4.9 doesn't, so in this patch __bp_harden_hyp_vecs_start needs to be accessed at a different address. I suspect that because I could not spot the mapping change during my backporting.
[...]
Sorry, I cannot be of help here, other than doing the backport myself (and I don't have the bandwidth for that now).
M.
On 02/01/2018 05:43 PM, Marc Zyngier wrote:
[...]
Hi Marc,
Many thanks for the info!
Could you tell me when the specific __bp_harden_hyp_vecs_start or bp_hardening_data gets linear mapped?
Thanks Alex
On 02/01/2018 06:23 PM, Alex Shi wrote:
[...]
What I see is: if the CPU boots in EL1, the above variables aren't linear mapped. If the CPU boots in EL2, they seem to be linear mapped before the BP vector is installed, and then kvm_map_vectors() maps them again later.
What am I missing?
Thanks a lot!
On 01/02/18 10:23, Alex Shi wrote:
On 02/01/2018 05:43 PM, Marc Zyngier wrote:
On 01/02/18 09:31, Alex Shi wrote:
I really don't understand your questions, so let me explain how things work:
Sorry for my idiot on virt machine. And many thanks for patient explanation!
From the doc Documentation/virtual/kvm/arm/hyp-abi.txt, I guess the correct concept is KVM is a hypervisor.
- The kernel embeds all of the KVM text. Some of that text is meant to
be mapped at EL2.
- All the mappings at HYP are at an offset from the linear mapping, and
you can convert a linear mapping VA to a HYP VA using kern_hyp_va().
why we need this mapping? and who/when did this mapping? Both of address are accessed from same EL level?
We need this mapping because EL2 cannot use the same VAs as EL1. Only only has a single TTBR, and thus cannot use negative addressing. The page tables are created by EL1, and only EL2 is accessing memory via this mapping.
That's how KVM/arm64 worked since the beginning of times, and not much has changed since then.
Hi Marc,
Many thanks for the info!
Would you like to tell me when the specific __bp_harden_hyp_vecs_start or bp_hardening_data got linear mapped?
The whole of the memory is in the linear map, from very early boot. Hence it contains the kernel and everything else. We don't do anything specific to map things in the linear map. Instead, we selectively map bits of it at other virtual addresses (the kernel gets its own executable mapping, the HYP code gets mapped at EL2...).
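A minimal sketch of what that means for the symbol in question (assumptions: the lm_alias() helper from patch 24/38 "mm: Introduce lm_alias" and the generic __pa_symbol()/__va() behaviour; the helper below is purely illustrative and not part of the series):

#include <linux/mm.h>		/* lm_alias(), __pa_symbol() */
#include <linux/printk.h>

extern char __bp_harden_hyp_vecs_start[];

static void show_vector_aliases(void)
{
	phys_addr_t pa = __pa_symbol(__bp_harden_hyp_vecs_start);

	/*
	 * The same physical page is reachable through (at least) two VAs:
	 * the kernel-image mapping the linker gave the symbol, and its
	 * alias in the linear map.
	 */
	pr_info("hyp vecs: kimage VA 0x%lx, phys %pa, linear alias 0x%lx\n",
		(unsigned long)__bp_harden_hyp_vecs_start, &pa,
		(unsigned long)lm_alias(__bp_harden_hyp_vecs_start));
}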
Thanks,
M.
On 01/31/2018 08:39 PM, Alex Shi wrote:
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
+{
- void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K)
Hi Will,
Here lm_alias converts the virtual address of __bp_harden_hyp_vecs_start from 0xffff000008098800 (for example) to 0xffff800008098800. Code based on v4.15-rc3 works well with this new address 0xffff80000...., but code based on v4.9 reports a paging failure:
Unable to handle kernel paging request at virtual address ffff8000000....
And without lm_alias, the kernel boots fine. Another place that uses lm_alias is kvm_get_hyp_vector().
Is it safe to drop lm_alias? Or is there somewhere that pages for the aliased address need to be requested?
The question you have to ask yourself is whether or not the address you're getting for __bp_harden_hyp_vecs_start is from the linear mapping or from another range. If the former, lm_alias is not necessary (but you may want to find out why it gives you something that is unexpected). If the latter, then you really need to translate it to the linear map, as you're not going to be able to write to kernel text via its execution mapping.
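In other words, a hedged sketch of that check (not code from the series; the PAGE_OFFSET comparison matches the 4.9-era arm64 layout where the linear map sits above the kernel image, i.e. 0xffff8000... vs 0xffff0000... in the addresses quoted above, and the helper name is made up for illustration):

#include <linux/mm.h>		/* lm_alias() */
#include <asm/memory.h>		/* PAGE_OFFSET */

/* Return a VA that is safe to write through for a given kernel address. */
static void *writable_alias(void *va)
{
	unsigned long addr = (unsigned long)va;

	if (addr >= PAGE_OFFSET)
		return va;		/* already a linear-map address */

	return lm_alias(va);		/* kimage address: use its linear alias */
}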
Hi Marc,
Thanks a lot for the help! :)
It seems I am still stuck and confused by this address alias issue. Is there some shared vector that needs to be accessed from both the host (hyp) and KVM (the normal kernel)? Or does hyp need to copy some vectors to (raw address - kimg) for itself? And if not in hyp, does the kernel only use the raw address?
I am still confused about the lm_alias usage, because the v4.15 kernel runs at EL1 and works with the lm_alias address 0xffff80..., but the v4.9 kernel only works with the raw 'dst' address 0xffff00... at EL1. At the same time, Juno r2 runs at EL2 and reports a paging failure on the raw address 0xffff00...
BTW, kvm_ksym_ref will return (address - kimg) or the raw address if in hyp. So why doesn't lm_alias do this?
As for kvm_get_hyp_vector, same thing. We only map the linear map at EL2, so you really need to pick the right set of VAs, or kern_hyp_va is going to point you to lalaland (and that will be pretty final).
So the hyp runs at EL2 and uses the lm_alias'ed address?
I cannot page 4.9 in at the moment, but some limited investigation should help you find out how we used to map the kernel last year.
Thanks,
M.
On 01/31/2018 09:19 PM, Alex Shi wrote:
BTW, kvm_ksym_ref will return (address - kimg) or the raw address if in hyp. So why doesn't lm_alias do this?
Sorry, lm_alias will return a different address by checking whether it is an lm address. Guess I am too tired to read carefully.
From: Marc Zyngier marc.zyngier@arm.com
Now that we have per-CPU vectors, let's plug them into the KVM/arm64 code.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit f8d72ca9cbbb66ddf464d7befddc6e68d2044e57) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: mv changes from virt/kvm/arm/arm.c to arch/arm/kvm/arm.c and skip vhe support --- arch/arm/include/asm/kvm_mmu.h | 10 ++++++++++ arch/arm/kvm/arm.c | 8 +++++++- arch/arm64/include/asm/kvm_mmu.h | 38 ++++++++++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/switch.c | 2 +- 4 files changed, 56 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index a58bbaa..d10e362 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -223,6 +223,16 @@ static inline unsigned int kvm_get_vmid_bits(void) return 8; }
+static inline void *kvm_get_hyp_vector(void) +{ + return kvm_ksym_ref(__kvm_hyp_vector); +} + +static inline int kvm_map_vectors(void) +{ + return 0; +} + #endif /* !__ASSEMBLY__ */
#endif /* __ARM_KVM_MMU_H__ */ diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 19b5f5c..3539fc4 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -1088,7 +1088,7 @@ static void cpu_init_hyp_mode(void *dummy) pgd_ptr = kvm_mmu_get_httbr(); stack_page = __this_cpu_read(kvm_arm_hyp_stack_page); hyp_stack_ptr = stack_page + PAGE_SIZE; - vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector); + vector_ptr = (unsigned long)kvm_get_hyp_vector();
__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr); __cpu_init_stage2(); @@ -1344,6 +1344,12 @@ static int init_hyp_mode(void) goto out_err; }
+ err = kvm_map_vectors(); + if (err) { + kvm_err("Cannot map vectors\n"); + goto out_err; + } + /* * Map the Hyp stack pages */ diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 6d22017..28ca229 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -313,5 +313,43 @@ static inline unsigned int kvm_get_vmid_bits(void) return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8; }
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +#include <asm/mmu.h> + +static inline void *kvm_get_hyp_vector(void) +{ + struct bp_hardening_data *data = arm64_get_bp_hardening_data(); + void *vect = kvm_ksym_ref(__kvm_hyp_vector); + + if (data->fn) { + vect = __bp_harden_hyp_vecs_start + + data->hyp_vectors_slot * SZ_2K; + + /* if (!has_vhe()) skip vhe support - alex shi */ + vect = lm_alias(vect); + } + + return vect; +} + +static inline int kvm_map_vectors(void) +{ + return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start), + kvm_ksym_ref(__bp_harden_hyp_vecs_end), + PAGE_HYP_EXEC); +} + +#else +static inline void *kvm_get_hyp_vector(void) +{ + return kvm_ksym_ref(__kvm_hyp_vector); +} + +static inline int kvm_map_vectors(void) +{ + return 0; +} +#endif + #endif /* __ASSEMBLY__ */ #endif /* __ARM64_KVM_MMU_H__ */ diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 0c848c1..cf6d962 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -50,7 +50,7 @@ static void __hyp_text __activate_traps_vhe(void) val &= ~CPACR_EL1_FPEN; write_sysreg(val, cpacr_el1);
- write_sysreg(__kvm_hyp_vector, vbar_el1); + write_sysreg(kvm_get_hyp_vector(), vbar_el1); }
static void __hyp_text __activate_traps_nvhe(void)
From: Marc Zyngier marc.zyngier@arm.com
For those CPUs that require PSCI to perform a BP invalidation, going all the way to the PSCI code for not much is a waste of precious cycles. Let's terminate that call as early as possible.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 7e5f53131d8fc3918d9007977122d15a8579c9c6) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/kvm/hyp/switch.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index cf6d962..3eab6ac 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -17,6 +17,7 @@
#include <linux/types.h> #include <linux/jump_label.h> +#include <uapi/linux/psci.h>
#include <asm/kvm_asm.h> #include <asm/kvm_emulate.h> @@ -308,6 +309,18 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu)) goto again;
+ if (exit_code == ARM_EXCEPTION_TRAP && + (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 || + kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) && + vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) { + u64 val = PSCI_RET_NOT_SUPPORTED; + if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features)) + val = 2; + + vcpu_set_reg(vcpu, 0, val); + goto again; + } + if (static_branch_unlikely(&vgic_v2_cpuif_trap) && exit_code == ARM_EXCEPTION_TRAP) { bool valid;
From: Marc Zyngier marc.zyngier@arm.com
Some minor erratum may not be fixed in further revisions of a core, leading to a situation where the workaround needs to be updated each time an updated core is released.
Introduce a MIDR_ALL_VERSIONS match helper that will work for all versions of that MIDR, once and for all.
Acked-by: Thomas Gleixner tglx@linutronix.de Acked-by: Mark Rutland mark.rutland@arm.com Acked-by: Daniel Lezcano daniel.lezcano@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com (cherry picked from commit 06f1494f837da8997d670a1ba87add7963b08922) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/kernel/cpu_errata.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 87b0163..75ad384 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -127,6 +127,13 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry, .midr_range_min = min, \ .midr_range_max = max
+#define MIDR_ALL_VERSIONS(model) \ + .def_scope = SCOPE_LOCAL_CPU, \ + .matches = is_affected_midr_range, \ + .midr_model = model, \ + .midr_range_min = 0, \ + .midr_range_max = (MIDR_VARIANT_MASK | MIDR_REVISION_MASK) + const struct arm64_cpu_capabilities arm64_errata[] = { #if defined(CONFIG_ARM64_ERRATUM_826319) || \ defined(CONFIG_ARM64_ERRATUM_827319) || \
From: Will Deacon will.deacon@arm.com
Hook up MIDR values for the Cortex-A72 and Cortex-A75 CPUs, since they will soon need MIDR matches for hardening the branch predictor.
Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit f0be3364335d47267aa1f7c5ed5faaa59c70db13) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: added A73 in arch/arm64/include/asm/cputype.h too. --- arch/arm64/include/asm/cputype.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 26a68dd..0843b3f 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -75,7 +75,10 @@ #define ARM_CPU_PART_AEM_V8 0xD0F #define ARM_CPU_PART_FOUNDATION 0xD00 #define ARM_CPU_PART_CORTEX_A57 0xD07 +#define ARM_CPU_PART_CORTEX_A72 0xD08 #define ARM_CPU_PART_CORTEX_A53 0xD03 +#define ARM_CPU_PART_CORTEX_A73 0xD09 +#define ARM_CPU_PART_CORTEX_A75 0xD0A
#define APM_CPU_PART_POTENZA 0x000
@@ -86,6 +89,9 @@
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) +#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72) +#define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73) +#define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
From: Will Deacon will.deacon@arm.com
Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing and can theoretically be attacked by malicious code.
This patch implements a PSCI-based mitigation for these CPUs when available. The call into firmware will invalidate the branch predictor state, preventing any malicious entries from affecting other victim contexts.
Co-developed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit d3d2867721c46f886ba90ee3706fd664be0bf086) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/kernel/bpi.S | 24 ++++++++++++++++++++++++ arch/arm64/kernel/cpu_errata.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+)
diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S index 06a931e..dec95bd 100644 --- a/arch/arm64/kernel/bpi.S +++ b/arch/arm64/kernel/bpi.S @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start) vectors __kvm_hyp_vector .endr ENTRY(__bp_harden_hyp_vecs_end) +ENTRY(__psci_hyp_bp_inval_start) + sub sp, sp, #(8 * 18) + stp x16, x17, [sp, #(16 * 0)] + stp x14, x15, [sp, #(16 * 1)] + stp x12, x13, [sp, #(16 * 2)] + stp x10, x11, [sp, #(16 * 3)] + stp x8, x9, [sp, #(16 * 4)] + stp x6, x7, [sp, #(16 * 5)] + stp x4, x5, [sp, #(16 * 6)] + stp x2, x3, [sp, #(16 * 7)] + stp x0, x1, [sp, #(16 * 8)] + mov x0, #0x84000000 + smc #0 + ldp x16, x17, [sp, #(16 * 0)] + ldp x14, x15, [sp, #(16 * 1)] + ldp x12, x13, [sp, #(16 * 2)] + ldp x10, x11, [sp, #(16 * 3)] + ldp x8, x9, [sp, #(16 * 4)] + ldp x6, x7, [sp, #(16 * 5)] + ldp x4, x5, [sp, #(16 * 6)] + ldp x2, x3, [sp, #(16 * 7)] + ldp x0, x1, [sp, #(16 * 8)] + add sp, sp, #(8 * 18) +ENTRY(__psci_hyp_bp_inval_end) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 75ad384..a3c60b2 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -53,6 +53,8 @@ static int cpu_enable_trap_ctr_access(void *__unused) DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
#ifdef CONFIG_KVM +extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[]; + static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) { @@ -94,6 +96,9 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn, spin_unlock(&bp_lock); } #else +#define __psci_hyp_bp_inval_start NULL +#define __psci_hyp_bp_inval_end NULL + static void __install_bp_hardening_cb(bp_hardening_cb_t fn, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -118,6 +123,21 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end); } + +#include <linux/psci.h> + +static int enable_psci_bp_hardening(void *data) +{ + const struct arm64_cpu_capabilities *entry = data; + + if (psci_ops.get_version) + install_bp_hardening_cb(entry, + (bp_hardening_cb_t)psci_ops.get_version, + __psci_hyp_bp_inval_start, + __psci_hyp_bp_inval_end); + + return 0; +} #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
#define MIDR_RANGE(model, min, max) \ @@ -211,6 +231,28 @@ const struct arm64_cpu_capabilities arm64_errata[] = { .def_scope = SCOPE_LOCAL_CPU, .enable = cpu_enable_trap_ctr_access, }, +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A57), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A72), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A73), + .enable = enable_psci_bp_hardening, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), + .enable = enable_psci_bp_hardening, + }, +#endif { } };
From: Jayachandran C jnair@caviumnetworks.com
Add the older Broadcom ID as well as the new Cavium ID for ThunderX2 CPUs.
Signed-off-by: Jayachandran C jnair@caviumnetworks.com Signed-off-by: Will Deacon will.deacon@arm.com (cherry picked from commit 7f08db7f8e176eaf936b87a8b5223482fd025f3c) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm64/include/asm/cputype.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 0843b3f..9ee3038 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -84,6 +84,7 @@
#define CAVIUM_CPU_PART_THUNDERX 0x0A1 #define CAVIUM_CPU_PART_THUNDERX_81XX 0x0A2 +#define CAVIUM_CPU_PART_THUNDERX2 0x0AF
#define BRCM_CPU_PART_VULCAN 0x516
@@ -94,6 +95,8 @@ #define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) +#define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2) +#define MIDR_BRCM_VULCAN MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_VULCAN)
#ifndef __ASSEMBLY__
From: Marc Zyngier marc.zyngier@arm.com
In order to avoid aliasing attacks against the branch predictor, some implementations require to invalidate the BTB when switching from one user context to another.
For this, we reuse the existing implementation for Cortex-A8, and apply it to A9, A12 and A17.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit efcd0e857a656bbd1c1da15ff984ad6402332c61) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm/mm/proc-v7-2level.S | 4 ++-- arch/arm/mm/proc-v7-3level.S | 6 ++++++ arch/arm/mm/proc-v7.S | 30 +++++++++++++++--------------- 3 files changed, 23 insertions(+), 17 deletions(-)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S index c6141a5..0422e58b 100644 --- a/arch/arm/mm/proc-v7-2level.S +++ b/arch/arm/mm/proc-v7-2level.S @@ -41,7 +41,7 @@ * even on Cortex-A8 revisions not affected by 430973. * If IBE is not set, the flush BTAC/BTB won't do anything. */ -ENTRY(cpu_ca8_switch_mm) +ENTRY(cpu_v7_btbinv_switch_mm) #ifdef CONFIG_MMU mov r2, #0 mcr p15, 0, r2, c7, c5, 6 @ flush BTAC/BTB @@ -66,7 +66,7 @@ ENTRY(cpu_v7_switch_mm) #endif bx lr ENDPROC(cpu_v7_switch_mm) -ENDPROC(cpu_ca8_switch_mm) +ENDPROC(cpu_v7_btbinv_switch_mm)
/* * cpu_v7_set_pte_ext(ptep, pte) diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S index 5e5720e..1537c6d 100644 --- a/arch/arm/mm/proc-v7-3level.S +++ b/arch/arm/mm/proc-v7-3level.S @@ -54,6 +54,11 @@ * Set the translation table base pointer to be pgd_phys (physical address of * the new TTB). */ +ENTRY(cpu_v7_btbinv_switch_mm) +#ifdef CONFIG_MMU + mov r2, #0 + mcr p15, 0, r2, c7, c5, 6 @ flush BTAC/BTB +#endif ENTRY(cpu_v7_switch_mm) #ifdef CONFIG_MMU mmid r2, r2 @@ -64,6 +69,7 @@ ENTRY(cpu_v7_switch_mm) #endif ret lr ENDPROC(cpu_v7_switch_mm) +ENDPROC(cpu_v7_btbinv_switch_mm)
#ifdef __ARMEB__ #define rl r3 diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S index d00d52c..379d178 100644 --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -154,18 +154,18 @@ ENDPROC(cpu_v7_do_resume) #endif
/* - * Cortex-A8 + * Cortex-A8/A12/A17 that require a BTB invalidation on switch_mm */ - globl_equ cpu_ca8_proc_init, cpu_v7_proc_init - globl_equ cpu_ca8_proc_fin, cpu_v7_proc_fin - globl_equ cpu_ca8_reset, cpu_v7_reset - globl_equ cpu_ca8_do_idle, cpu_v7_do_idle - globl_equ cpu_ca8_dcache_clean_area, cpu_v7_dcache_clean_area - globl_equ cpu_ca8_set_pte_ext, cpu_v7_set_pte_ext - globl_equ cpu_ca8_suspend_size, cpu_v7_suspend_size + globl_equ cpu_v7_btbinv_proc_init, cpu_v7_proc_init + globl_equ cpu_v7_btbinv_proc_fin, cpu_v7_proc_fin + globl_equ cpu_v7_btbinv_reset, cpu_v7_reset + globl_equ cpu_v7_btbinv_do_idle, cpu_v7_do_idle + globl_equ cpu_v7_btbinv_dcache_clean_area, cpu_v7_dcache_clean_area + globl_equ cpu_v7_btbinv_set_pte_ext, cpu_v7_set_pte_ext + globl_equ cpu_v7_btbinv_suspend_size, cpu_v7_suspend_size #ifdef CONFIG_ARM_CPU_SUSPEND - globl_equ cpu_ca8_do_suspend, cpu_v7_do_suspend - globl_equ cpu_ca8_do_resume, cpu_v7_do_resume + globl_equ cpu_v7_btbinv_do_suspend, cpu_v7_do_suspend + globl_equ cpu_v7_btbinv_do_resume, cpu_v7_do_resume #endif
/* @@ -176,7 +176,7 @@ ENDPROC(cpu_v7_do_resume) globl_equ cpu_ca9mp_reset, cpu_v7_reset globl_equ cpu_ca9mp_do_idle, cpu_v7_do_idle globl_equ cpu_ca9mp_dcache_clean_area, cpu_v7_dcache_clean_area - globl_equ cpu_ca9mp_switch_mm, cpu_v7_switch_mm + globl_equ cpu_ca9mp_switch_mm, cpu_v7_btbinv_switch_mm globl_equ cpu_ca9mp_set_pte_ext, cpu_v7_set_pte_ext .globl cpu_ca9mp_suspend_size .equ cpu_ca9mp_suspend_size, cpu_v7_suspend_size + 4 * 2 @@ -543,8 +543,8 @@ __v7_setup_stack:
@ define struct processor (see <asm/proc-fns.h> and proc-macros.S) define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions v7_btbinv, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #ifndef CONFIG_ARM_LPAE - define_processor_functions ca8, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif #ifdef CONFIG_CPU_PJ4B @@ -609,7 +609,7 @@ __v7_ca9mp_proc_info: __v7_ca8_proc_info: .long 0x410fc080 .long 0xff0ffff0 - __v7_proc __v7_ca8_proc_info, __v7_setup, proc_fns = ca8_processor_functions + __v7_proc __v7_ca8_proc_info, __v7_setup, proc_fns = v7_btbinv_processor_functions .size __v7_ca8_proc_info, . - __v7_ca8_proc_info
#endif /* CONFIG_ARM_LPAE */ @@ -653,7 +653,7 @@ __v7_ca7mp_proc_info: __v7_ca12mp_proc_info: .long 0x410fc0d0 .long 0xff0ffff0 - __v7_proc __v7_ca12mp_proc_info, __v7_ca12mp_setup + __v7_proc __v7_ca12mp_proc_info, __v7_ca12mp_setup, proc_fns = v7_btbinv_processor_functions .size __v7_ca12mp_proc_info, . - __v7_ca12mp_proc_info
/* @@ -683,7 +683,7 @@ __v7_b15mp_proc_info: __v7_ca17mp_proc_info: .long 0x410fc0e0 .long 0xff0ffff0 - __v7_proc __v7_ca17mp_proc_info, __v7_ca17mp_setup + __v7_proc __v7_ca17mp_proc_info, __v7_ca17mp_setup, proc_fns = v7_btbinv_processor_functions .size __v7_ca17mp_proc_info, . - __v7_ca17mp_proc_info
/*
From: Marc Zyngier marc.zyngier@arm.com
In order to avoid aliasing attacks against the branch predictor, let's invalidate the BTB on guest exit. This is made complicated by the fact that we cannot take a branch before invalidating the BTB.
Another thing is that we perform the invalidation on all implementations, no matter if they are affected or not.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit 8f4c8defa0cfc2c8d53d095eeda6c6ad1440201b) Signed-off-by: Alex Shi alex.shi@linaro.org
Conflicts: patch on old hyp_hvc: in arch/arm/kvm/hyp/hyp-entry.S --- arch/arm/include/asm/kvm_asm.h | 2 -- arch/arm/include/asm/kvm_mmu.h | 13 ++++++++- arch/arm/kvm/hyp/hyp-entry.S | 64 ++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 74 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h index 8ef0538..24f3ec7 100644 --- a/arch/arm/include/asm/kvm_asm.h +++ b/arch/arm/include/asm/kvm_asm.h @@ -61,8 +61,6 @@ struct kvm_vcpu; extern char __kvm_hyp_init[]; extern char __kvm_hyp_init_end[];
-extern char __kvm_hyp_vector[]; - extern void __kvm_flush_vm_context(void); extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa); extern void __kvm_tlb_flush_vmid(struct kvm *kvm); diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index d10e362..de520c9 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -37,6 +37,7 @@
#include <linux/highmem.h> #include <asm/cacheflush.h> +#include <asm/cputype.h> #include <asm/pgalloc.h> #include <asm/stage2_pgtable.h>
@@ -225,7 +226,17 @@ static inline unsigned int kvm_get_vmid_bits(void)
static inline void *kvm_get_hyp_vector(void) { - return kvm_ksym_ref(__kvm_hyp_vector); + extern char __kvm_hyp_vector[]; + extern char __kvm_hyp_vector_bp_inv[]; + + switch(read_cpuid_part()) { + case ARM_CPU_PART_CORTEX_A12: + case ARM_CPU_PART_CORTEX_A17: + return kvm_ksym_ref(__kvm_hyp_vector_bp_inv); + + default: + return kvm_ksym_ref(__kvm_hyp_vector); + } }
static inline int kvm_map_vectors(void) diff --git a/arch/arm/kvm/hyp/hyp-entry.S b/arch/arm/kvm/hyp/hyp-entry.S index 96beb53..6ac69c69 100644 --- a/arch/arm/kvm/hyp/hyp-entry.S +++ b/arch/arm/kvm/hyp/hyp-entry.S @@ -70,6 +70,59 @@ __kvm_hyp_vector: W(b) hyp_hvc W(b) hyp_irq W(b) hyp_fiq + + .align 5 +__kvm_hyp_vector_bp_inv: + .global __kvm_hyp_vector_bp_inv + + /* + * We encode the exception entry in the bottom 3 bits of + * SP, and we have to guarantee to be 8 bytes aligned. + */ + W(add) sp, sp, #1 /* Reset 7 */ + W(add) sp, sp, #1 /* Undef 6 */ + W(add) sp, sp, #1 /* Syscall 5 */ + W(add) sp, sp, #1 /* Prefetch abort 4 */ + W(add) sp, sp, #1 /* Data abort 3 */ + W(add) sp, sp, #1 /* HVC 2 */ + W(add) sp, sp, #1 /* IRQ 1 */ + W(add) sp, sp, #1 /* FIQ 0 */ + + sub sp, sp, #1 + + mcr p15, 0, r0, c7, c5, 6 /* BPIALL */ + isb + + /* + * Yet another silly hack: Use VPIDR as a temp register. + * Thumb2 is really a pain, as SP cannot be used with most + * of the bitwise instructions. The vect_br macro ensures + * things gets cleaned-up. + */ + mcr p15, 4, r0, c0, c0, 0 /* VPIDR */ + mov r0, sp + and r0, r0, #7 + sub sp, sp, r0 + push {r1, r2} + mov r1, r0 + mrc p15, 4, r0, c0, c0, 0 /* VPIDR */ + mrc p15, 0, r2, c0, c0, 0 /* MIDR */ + mcr p15, 4, r2, c0, c0, 0 /* VPIDR */ + +.macro vect_br val, targ + cmp r1, #\val + popeq {r1, r2} + beq \targ +.endm + + vect_br 0, hyp_fiq + vect_br 1, hyp_irq + vect_br 2, hyp_hvc + vect_br 3, hyp_dabt + vect_br 4, hyp_pabt + vect_br 5, hyp_svc + vect_br 6, hyp_undef + vect_br 7, hyp_reset
.macro invalid_vector label, cause .align @@ -131,7 +184,14 @@ hyp_hvc: mrceq p15, 4, r0, c12, c0, 0 @ get HVBAR beq 1f
- push {lr} + /* + * Pushing r2 here is just a way of keeping the stack aligned to + * 8 bytes on any path that can trigger a HYP exception. Here, + * we may well be about to jump into the guest, and the guest + * exit would otherwise be badly decoded by our fancy + * "decode-exception-without-a-branch" code... + */ + push {r2, lr}
mov lr, r0 mov r0, r1 @@ -141,7 +201,7 @@ hyp_hvc: THUMB( orr lr, #1) blx lr @ Call the HYP function
- pop {lr} + pop {r2, lr} 1: eret
guest_trap:
From: Marc Zyngier marc.zyngier@arm.com
In order to avoid aliasing attacks against the branch predictor, Cortex-A15 require to invalidate the BTB when switching from one user context to another. The only way to do so on this CPU is to perform an ICIALLU, having set ACTLR[0] to 1 from secure mode.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit e968fd4da5b7d0b3fe29613cedfd5d587c492a4e) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm/mm/proc-v7-2level.S | 10 ++++++++++ arch/arm/mm/proc-v7-3level.S | 16 ++++++++++++++++ arch/arm/mm/proc-v7.S | 18 +++++++++++++++++- 3 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S index 0422e58b..7dc9e1c 100644 --- a/arch/arm/mm/proc-v7-2level.S +++ b/arch/arm/mm/proc-v7-2level.S @@ -40,7 +40,17 @@ * Note that we always need to flush BTAC/BTB if IBE is set * even on Cortex-A8 revisions not affected by 430973. * If IBE is not set, the flush BTAC/BTB won't do anything. + * + * Cortex-A15 requires ACTLR[0] to be set from secure in order + * for the icache invalidation to also invalidate the BTB. */ +ENTRY(cpu_ca15_switch_mm) +#ifdef CONFIG_MMU + mcr p15, 0, r0, c7, c5, 0 @ ICIALLU + isb + b cpu_v7_switch_mm +#endif +ENDPROC(cpu_ca15_switch_mm) ENTRY(cpu_v7_btbinv_switch_mm) #ifdef CONFIG_MMU mov r2, #0 diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S index 1537c6d..b2d9909 100644 --- a/arch/arm/mm/proc-v7-3level.S +++ b/arch/arm/mm/proc-v7-3level.S @@ -71,6 +71,22 @@ ENTRY(cpu_v7_switch_mm) ENDPROC(cpu_v7_switch_mm) ENDPROC(cpu_v7_btbinv_switch_mm)
+/* + * Cortex-A15 requires ACTLR[0] to be set from secure in order + * for the icache invalidation to also invalidate the BTB. + */ +ENTRY(cpu_ca15_switch_mm) +#ifdef CONFIG_MMU + mcr p15, 0, r0, c7, c5, 0 @ ICIALLU + mmid r2, r2 + asid r2, r2 + orr rpgdh, rpgdh, r2, lsl #(48 - 32) @ upper 32-bits of pgd + mcrr p15, 0, rpgdl, rpgdh, c2 @ set TTB 0 + isb +#endif + ret lr +ENDPROC(cpu_ca15_switch_mm) + #ifdef __ARMEB__ #define rl r3 #define rh r2 diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S index 379d178..d799040 100644 --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -169,6 +169,21 @@ ENDPROC(cpu_v7_do_resume) #endif
/* + * Cortex-A15 that require an icache invalidation on switch_mm + */ + globl_equ cpu_ca15_proc_init, cpu_v7_proc_init + globl_equ cpu_ca15_proc_fin, cpu_v7_proc_fin + globl_equ cpu_ca15_reset, cpu_v7_reset + globl_equ cpu_ca15_do_idle, cpu_v7_do_idle + globl_equ cpu_ca15_dcache_clean_area, cpu_v7_dcache_clean_area + globl_equ cpu_ca15_set_pte_ext, cpu_v7_set_pte_ext + globl_equ cpu_ca15_suspend_size, cpu_v7_suspend_size +#ifdef CONFIG_ARM_CPU_SUSPEND + globl_equ cpu_ca15_do_suspend, cpu_v7_do_suspend + globl_equ cpu_ca15_do_resume, cpu_v7_do_resume +#endif + +/* * Cortex-A9 processor functions */ globl_equ cpu_ca9mp_proc_init, cpu_v7_proc_init @@ -544,6 +559,7 @@ __v7_setup_stack: @ define struct processor (see <asm/proc-fns.h> and proc-macros.S) define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 define_processor_functions v7_btbinv, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions ca15, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #ifndef CONFIG_ARM_LPAE define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif @@ -663,7 +679,7 @@ __v7_ca12mp_proc_info: __v7_ca15mp_proc_info: .long 0x410fc0f0 .long 0xff0ffff0 - __v7_proc __v7_ca15mp_proc_info, __v7_ca15mp_setup + __v7_proc __v7_ca15mp_proc_info, __v7_ca15mp_setup, proc_fns = ca15_processor_functions .size __v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
/*
From: Marc Zyngier marc.zyngier@arm.com
In order to avoid aliasing attacks against the branch predictor on Cortex-A15, let's invalidate the BTB on guest exit, which can only be done by invalidating the icache (with ACTLR[0] being set).
We use the same hack as for A12/A17 to perform the vector decoding.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit 0055b16009defc9fe56bbbd77714b0d3841c7dd8) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm/include/asm/kvm_mmu.h | 4 ++++ arch/arm/kvm/hyp/hyp-entry.S | 27 ++++++++++++++++++++++++++- 2 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index de520c9..f002ef8 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -228,12 +228,16 @@ static inline void *kvm_get_hyp_vector(void) { extern char __kvm_hyp_vector[]; extern char __kvm_hyp_vector_bp_inv[]; + extern char __kvm_hyp_vector_ic_inv[];
switch(read_cpuid_part()) { case ARM_CPU_PART_CORTEX_A12: case ARM_CPU_PART_CORTEX_A17: return kvm_ksym_ref(__kvm_hyp_vector_bp_inv);
+ case ARM_CPU_PART_CORTEX_A15: + return kvm_ksym_ref(__kvm_hyp_vector_ic_inv); + default: return kvm_ksym_ref(__kvm_hyp_vector); } diff --git a/arch/arm/kvm/hyp/hyp-entry.S b/arch/arm/kvm/hyp/hyp-entry.S index 6ac69c69..0d82ac6 100644 --- a/arch/arm/kvm/hyp/hyp-entry.S +++ b/arch/arm/kvm/hyp/hyp-entry.S @@ -70,7 +70,31 @@ __kvm_hyp_vector: W(b) hyp_hvc W(b) hyp_irq W(b) hyp_fiq - + + .align 5 +__kvm_hyp_vector_ic_inv: + .global __kvm_hyp_vector_ic_inv + + /* + * We encode the exception entry in the bottom 3 bits of + * SP, and we have to guarantee to be 8 bytes aligned. + */ + W(add) sp, sp, #1 /* Reset 7 */ + W(add) sp, sp, #1 /* Undef 6 */ + W(add) sp, sp, #1 /* Syscall 5 */ + W(add) sp, sp, #1 /* Prefetch abort 4 */ + W(add) sp, sp, #1 /* Data abort 3 */ + W(add) sp, sp, #1 /* HVC 2 */ + W(add) sp, sp, #1 /* IRQ 1 */ + W(add) sp, sp, #1 /* FIQ 0 */ + + sub sp, sp, #1 + + mcr p15, 0, r0, c7, c5, 0 /* ICIALLU */ + isb + + b decode_vectors + .align 5 __kvm_hyp_vector_bp_inv: .global __kvm_hyp_vector_bp_inv @@ -93,6 +117,7 @@ __kvm_hyp_vector_bp_inv: mcr p15, 0, r0, c7, c5, 6 /* BPIALL */ isb
+decode_vectors: /* * Yet another silly hack: Use VPIDR as a temp register. * Thumb2 is really a pain, as SP cannot be used with most
From: Marc Zyngier marc.zyngier@arm.com
In order to prevent aliasing attacks on the branch predictor, invalidate the BTB on CPUs that are known to be affected when taking a prefetch abort on an address that is outside of a user task limit.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit a07373c8c365746583f25f49fee41b1bc0ff94b2) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm/include/asm/cp15.h | 2 ++ arch/arm/mm/fault.c | 25 +++++++++++++++++ arch/arm/mm/fsr-2level.c | 4 +-- arch/arm/mm/fsr-3level.c | 67 ++++++++++++++++++++++++++++++++++++++++++++- 4 files changed, 95 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/cp15.h b/arch/arm/include/asm/cp15.h index dbdbce1..0672ddc 100644 --- a/arch/arm/include/asm/cp15.h +++ b/arch/arm/include/asm/cp15.h @@ -64,6 +64,8 @@ #define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : : "r" ((t)(v))) #define write_sysreg(v, ...) __write_sysreg(v, __VA_ARGS__)
+#define BPIALL __ACCESS_CP15(c7, 0, c5, 6) + extern unsigned long cr_alignment; /* defined in entry-armv.S */
static inline unsigned long get_cr(void) diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index f7861dc..cb3b1fa 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -20,6 +20,7 @@ #include <linux/highmem.h> #include <linux/perf_event.h>
+#include <asm/cp15.h> #include <asm/exception.h> #include <asm/pgtable.h> #include <asm/system_misc.h> @@ -180,6 +181,7 @@ __do_user_fault(struct task_struct *tsk, unsigned long addr, si.si_errno = 0; si.si_code = code; si.si_addr = (void __user *)addr; + force_sig_info(sig, &si, tsk); }
@@ -395,12 +397,35 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) __do_kernel_fault(mm, addr, fsr, regs); return 0; } + +static int +do_pabt_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) +{ + if (addr > TASK_SIZE) { + switch(read_cpuid_part()) { + case ARM_CPU_PART_CORTEX_A8: + case ARM_CPU_PART_CORTEX_A9: + case ARM_CPU_PART_CORTEX_A12: + case ARM_CPU_PART_CORTEX_A17: + write_sysreg(0, BPIALL); + break; + } + } + + return do_page_fault(addr, fsr, regs); +} #else /* CONFIG_MMU */ static int do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { return 0; } + +static int +do_pabt_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) +{ + return 0; +} #endif /* CONFIG_MMU */
/* diff --git a/arch/arm/mm/fsr-2level.c b/arch/arm/mm/fsr-2level.c index 18ca74c..4cede9b 100644 --- a/arch/arm/mm/fsr-2level.c +++ b/arch/arm/mm/fsr-2level.c @@ -50,7 +50,7 @@ static struct fsr_info ifsr_info[] = { { do_bad, SIGBUS, 0, "unknown 4" }, { do_translation_fault, SIGSEGV, SEGV_MAPERR, "section translation fault" }, { do_bad, SIGSEGV, SEGV_ACCERR, "page access flag fault" }, - { do_page_fault, SIGSEGV, SEGV_MAPERR, "page translation fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_MAPERR, "page translation fault" }, { do_bad, SIGBUS, 0, "external abort on non-linefetch" }, { do_bad, SIGSEGV, SEGV_ACCERR, "section domain fault" }, { do_bad, SIGBUS, 0, "unknown 10" }, @@ -58,7 +58,7 @@ static struct fsr_info ifsr_info[] = { { do_bad, SIGBUS, 0, "external abort on translation" }, { do_sect_fault, SIGSEGV, SEGV_ACCERR, "section permission fault" }, { do_bad, SIGBUS, 0, "external abort on translation" }, - { do_page_fault, SIGSEGV, SEGV_ACCERR, "page permission fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_ACCERR, "page permission fault" }, { do_bad, SIGBUS, 0, "unknown 16" }, { do_bad, SIGBUS, 0, "unknown 17" }, { do_bad, SIGBUS, 0, "unknown 18" }, diff --git a/arch/arm/mm/fsr-3level.c b/arch/arm/mm/fsr-3level.c index ab4409a..3158340 100644 --- a/arch/arm/mm/fsr-3level.c +++ b/arch/arm/mm/fsr-3level.c @@ -65,4 +65,69 @@ static struct fsr_info fsr_info[] = { { do_bad, SIGBUS, 0, "unknown 63" }, };
-#define ifsr_info fsr_info +static struct fsr_info ifsr_info[] = { + { do_bad, SIGBUS, 0, "unknown 0" }, + { do_bad, SIGBUS, 0, "unknown 1" }, + { do_bad, SIGBUS, 0, "unknown 2" }, + { do_bad, SIGBUS, 0, "unknown 3" }, + { do_bad, SIGBUS, 0, "reserved translation fault" }, + { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 1 translation fault" }, + { do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 2 translation fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_MAPERR, "level 3 translation fault" }, + { do_bad, SIGBUS, 0, "reserved access flag fault" }, + { do_bad, SIGSEGV, SEGV_ACCERR, "level 1 access flag fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 access flag fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 access flag fault" }, + { do_bad, SIGBUS, 0, "reserved permission fault" }, + { do_bad, SIGSEGV, SEGV_ACCERR, "level 1 permission fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 permission fault" }, + { do_pabt_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 permission fault" }, + { do_bad, SIGBUS, 0, "synchronous external abort" }, + { do_bad, SIGBUS, 0, "asynchronous external abort" }, + { do_bad, SIGBUS, 0, "unknown 18" }, + { do_bad, SIGBUS, 0, "unknown 19" }, + { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous abort (translation table walk)" }, + { do_bad, SIGBUS, 0, "synchronous parity error" }, + { do_bad, SIGBUS, 0, "asynchronous parity error" }, + { do_bad, SIGBUS, 0, "unknown 26" }, + { do_bad, SIGBUS, 0, "unknown 27" }, + { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk" }, + { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk" }, + { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk" }, + { do_bad, SIGBUS, 0, "synchronous parity error (translation table walk" }, + { do_bad, SIGBUS, 0, "unknown 32" }, + { do_bad, SIGBUS, BUS_ADRALN, "alignment fault" }, + { do_bad, SIGBUS, 0, "debug event" }, + { do_bad, SIGBUS, 0, "unknown 35" }, + { do_bad, SIGBUS, 0, "unknown 36" }, + { do_bad, SIGBUS, 0, "unknown 37" }, + { do_bad, SIGBUS, 0, "unknown 38" }, + { do_bad, SIGBUS, 0, "unknown 39" }, + { do_bad, SIGBUS, 0, "unknown 40" }, + { do_bad, SIGBUS, 0, "unknown 41" }, + { do_bad, SIGBUS, 0, "unknown 42" }, + { do_bad, SIGBUS, 0, "unknown 43" }, + { do_bad, SIGBUS, 0, "unknown 44" }, + { do_bad, SIGBUS, 0, "unknown 45" }, + { do_bad, SIGBUS, 0, "unknown 46" }, + { do_bad, SIGBUS, 0, "unknown 47" }, + { do_bad, SIGBUS, 0, "unknown 48" }, + { do_bad, SIGBUS, 0, "unknown 49" }, + { do_bad, SIGBUS, 0, "unknown 50" }, + { do_bad, SIGBUS, 0, "unknown 51" }, + { do_bad, SIGBUS, 0, "implementation fault (lockdown abort)" }, + { do_bad, SIGBUS, 0, "unknown 53" }, + { do_bad, SIGBUS, 0, "unknown 54" }, + { do_bad, SIGBUS, 0, "unknown 55" }, + { do_bad, SIGBUS, 0, "unknown 56" }, + { do_bad, SIGBUS, 0, "unknown 57" }, + { do_bad, SIGBUS, 0, "implementation fault (coprocessor abort)" }, + { do_bad, SIGBUS, 0, "unknown 59" }, + { do_bad, SIGBUS, 0, "unknown 60" }, + { do_bad, SIGBUS, 0, "unknown 61" }, + { do_bad, SIGBUS, 0, "unknown 62" }, + { do_bad, SIGBUS, 0, "unknown 63" }, +};
From: Marc Zyngier marc.zyngier@arm.com
In order to prevent aliasing attacks on the branch predictor, invalidate the icache on Cortex-A15, which has the side effect of invalidating the BTB. This requires ACTLR[0] to be set to 1 (secure operation).
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com (cherry picked from commit 1710fcb6a220f485c2764e1384800ebdf8a13aed) Signed-off-by: Alex Shi alex.shi@linaro.org --- arch/arm/include/asm/cp15.h | 1 + arch/arm/mm/fault.c | 4 ++++ 2 files changed, 5 insertions(+)
diff --git a/arch/arm/include/asm/cp15.h b/arch/arm/include/asm/cp15.h index 0672ddc..b74b174 100644 --- a/arch/arm/include/asm/cp15.h +++ b/arch/arm/include/asm/cp15.h @@ -65,6 +65,7 @@ #define write_sysreg(v, ...) __write_sysreg(v, __VA_ARGS__)
#define BPIALL __ACCESS_CP15(c7, 0, c5, 6) +#define ICIALLU __ACCESS_CP15(c7, 0, c5, 0)
extern unsigned long cr_alignment; /* defined in entry-armv.S */
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index cb3b1fa..e282586 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -409,6 +409,10 @@ do_pabt_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) case ARM_CPU_PART_CORTEX_A17: write_sysreg(0, BPIALL); break; + + case ARM_CPU_PART_CORTEX_A15: + write_sysreg(0, ICIALLU); + break; } }