This is a note to let you know that I've just added the patch titled
x86/paravirt: Dont patch flush_tlb_single
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-paravirt-dont-patch-flush_tlb_single.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Dec 2017 15:07:30 +0100
Subject: x86/paravirt: Dont patch flush_tlb_single
From: Thomas Gleixner <tglx(a)linutronix.de>
commit a035795499ca1c2bd1928808d1a156eda1420383 upstream
native_flush_tlb_single() will be changed with the upcoming
PAGE_TABLE_ISOLATION feature. This requires to have more code in
there than INVLPG.
Remove the paravirt patching for it.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Acked-by: Peter Zijlstra <peterz(a)infradead.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Cc: michael.schwarz(a)iaik.tugraz.at
Cc: moritz.lipp(a)iaik.tugraz.at
Cc: richard.fellner(a)student.tugraz.at
Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/paravirt_patch_64.c | 2 --
1 file changed, 2 deletions(-)
--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -9,7 +9,6 @@ DEF_NATIVE(pv_irq_ops, save_fl, "pushfq;
DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
-DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
DEF_NATIVE(pv_cpu_ops, clts, "clts");
DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
@@ -59,7 +58,6 @@ unsigned native_patch(u8 type, u16 clobb
PATCH_SITE(pv_mmu_ops, read_cr3);
PATCH_SITE(pv_mmu_ops, write_cr3);
PATCH_SITE(pv_cpu_ops, clts);
- PATCH_SITE(pv_mmu_ops, flush_tlb_single);
PATCH_SITE(pv_cpu_ops, wbinvd);
#if defined(CONFIG_PARAVIRT_SPINLOCKS)
case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch
This is a note to let you know that I've just added the patch titled
x86/kaiser: Move feature detection up
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kaiser-move-feature-detection-up.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Mon, 25 Dec 2017 13:57:16 +0100
Subject: x86/kaiser: Move feature detection up
From: Borislav Petkov <bp(a)suse.de>
... before the first use of kaiser_enabled as otherwise funky
things happen:
about to get started...
(XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
(XEN) Pagetable walk from ffff88022a449090:
(XEN) L4[0x110] = 0000000229e0e067 0000000000001e0e
(XEN) L3[0x008] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08
entry.o#create_bounce_frame+0x135/0x14d
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.9.1_02-3.21 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff81007460>]
(XEN) RFLAGS: 0000000000000286 EM: 1 CONTEXT: pv guest (d0v0)
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/kaiser.h | 2 ++
arch/x86/kernel/setup.c | 7 +++++++
arch/x86/mm/kaiser.c | 2 --
3 files changed, 9 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -96,8 +96,10 @@ DECLARE_PER_CPU(unsigned long, x86_cr3_p
extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
extern int kaiser_enabled;
+extern void __init kaiser_check_boottime_disable(void);
#else
#define kaiser_enabled 0
+static inline void __init kaiser_check_boottime_disable(void) {}
#endif /* CONFIG_KAISER */
/*
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -114,6 +114,7 @@
#include <asm/microcode.h>
#include <asm/mmu_context.h>
#include <asm/kaslr.h>
+#include <asm/kaiser.h>
/*
* max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -1019,6 +1020,12 @@ void __init setup_arch(char **cmdline_p)
*/
init_hypervisor_platform();
+ /*
+ * This needs to happen right after XENPV is set on xen and
+ * kaiser_enabled is checked below in cleanup_highmap().
+ */
+ kaiser_check_boottime_disable();
+
x86_init.resources.probe_roms();
/* after parse_early_param, so could debug it */
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -310,8 +310,6 @@ void __init kaiser_init(void)
{
int cpu;
- kaiser_check_boottime_disable();
-
if (!kaiser_enabled)
return;
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/x86-kaiser-reenable-paravirt.patch
queue-4.9/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
queue-4.9/x86-kaiser-check-boottime-cmdline-params.patch
queue-4.9/x86-kaiser-move-feature-detection-up.patch
This is a note to let you know that I've just added the patch titled
x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 2 Jan 2018 14:19:48 +0100
Subject: x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
From: Borislav Petkov <bp(a)suse.de>
Concentrate it in arch/x86/mm/kaiser.c and use the upstream string "nopti".
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/kernel-parameters.txt | 2 +-
arch/x86/kernel/cpu/common.c | 18 ------------------
arch/x86/mm/kaiser.c | 20 +++++++++++++++++++-
3 files changed, 20 insertions(+), 20 deletions(-)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2763,7 +2763,7 @@ bytes respectively. Such letter suffixes
nojitter [IA-64] Disables jitter checking for ITC timers.
- nokaiser [X86-64] Disable KAISER isolation of kernel from user.
+ nopti [X86-64] Disable KAISER isolation of kernel from user.
no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -179,20 +179,6 @@ static int __init x86_pcid_setup(char *s
return 1;
}
__setup("nopcid", x86_pcid_setup);
-
-static int __init x86_nokaiser_setup(char *s)
-{
- /* nokaiser doesn't accept parameters */
- if (s)
- return -EINVAL;
-#ifdef CONFIG_KAISER
- kaiser_enabled = 0;
- setup_clear_cpu_cap(X86_FEATURE_KAISER);
- pr_info("nokaiser: KAISER feature disabled\n");
-#endif
- return 0;
-}
-early_param("nokaiser", x86_nokaiser_setup);
#endif
static int __init x86_noinvpcid_setup(char *s)
@@ -813,10 +799,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
init_scattered_cpuid_features(c);
-#ifdef CONFIG_KAISER
- if (kaiser_enabled)
- set_cpu_cap(c, X86_FEATURE_KAISER);
-#endif
}
static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -274,8 +274,13 @@ void __init kaiser_init(void)
{
int cpu;
- if (!kaiser_enabled)
+ if (!kaiser_enabled) {
+ setup_clear_cpu_cap(X86_FEATURE_KAISER);
return;
+ }
+
+ setup_force_cpu_cap(X86_FEATURE_KAISER);
+
kaiser_init_all_pgds();
for_each_possible_cpu(cpu) {
@@ -418,3 +423,16 @@ void kaiser_flush_tlb_on_return_to_user(
X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
}
EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
+
+static int __init x86_nokaiser_setup(char *s)
+{
+ /* nopti doesn't accept parameters */
+ if (s)
+ return -EINVAL;
+
+ kaiser_enabled = 0;
+ pr_info("Kernel/User page tables isolation: disabled\n");
+
+ return 0;
+}
+early_param("nopti", x86_nokaiser_setup);
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/x86-kaiser-reenable-paravirt.patch
queue-4.9/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
queue-4.9/x86-kaiser-check-boottime-cmdline-params.patch
queue-4.9/x86-kaiser-move-feature-detection-up.patch
This is a note to let you know that I've just added the patch titled
x86/kaiser: Reenable PARAVIRT
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kaiser-reenable-paravirt.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 2 Jan 2018 14:19:49 +0100
Subject: x86/kaiser: Reenable PARAVIRT
From: Borislav Petkov <bp(a)suse.de>
Now that the required bits have been addressed, reenable
PARAVIRT.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
security/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -34,7 +34,7 @@ config SECURITY
config KAISER
bool "Remove the kernel mapping in user mode"
default y
- depends on X86_64 && SMP && !PARAVIRT
+ depends on X86_64 && SMP
help
This enforces a strict kernel and user space isolation, in order
to close hardware side channels on kernel address information.
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/x86-kaiser-reenable-paravirt.patch
queue-4.9/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
queue-4.9/x86-kaiser-check-boottime-cmdline-params.patch
queue-4.9/x86-kaiser-move-feature-detection-up.patch
This is a note to let you know that I've just added the patch titled
kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Hugh Dickins <hughd(a)google.com>
Date: Sun, 27 Aug 2017 16:24:27 -0700
Subject: kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
From: Hugh Dickins <hughd(a)google.com>
Mostly this commit is just unshouting X86_CR3_PCID_KERN_VAR and
X86_CR3_PCID_USER_VAR: we usually name variables in lower-case.
But why does x86_cr3_pcid_noflush need to be __aligned(PAGE_SIZE)?
Ah, it's a leftover from when kaiser_add_user_map() once complained
about mapping the same page twice. Make it __read_mostly instead.
(I'm a little uneasy about all the unrelated data which shares its
page getting user-mapped too, but that was so before, and not a big
deal: though we call it user-mapped, it's not mapped with _PAGE_USER.)
And there is a little change around the two calls to do_nmi().
Previously they set the NOFLUSH bit (if PCID supported) when
forcing to kernel context before do_nmi(); now they also have the
NOFLUSH bit set (if PCID supported) when restoring context after:
nothing done in do_nmi() should require a TLB to be flushed here.
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_64.S | 8 ++++----
arch/x86/include/asm/kaiser.h | 11 +++++------
arch/x86/mm/kaiser.c | 13 +++++++------
3 files changed, 16 insertions(+), 16 deletions(-)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1316,11 +1316,11 @@ ENTRY(nmi)
/* Unconditionally use kernel CR3 for do_nmi() */
/* %rax is saved above, so OK to clobber here */
movq %cr3, %rax
+ /* If PCID enabled, NOFLUSH now and NOFLUSH on return */
+ orq x86_cr3_pcid_noflush, %rax
pushq %rax
/* mask off "user" bit of pgd address and 12 PCID bits: */
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
- /* Add back kernel PCID and "no flush" bit */
- orq X86_CR3_PCID_KERN_VAR, %rax
movq %rax, %cr3
#endif
call do_nmi
@@ -1560,11 +1560,11 @@ end_repeat_nmi:
/* Unconditionally use kernel CR3 for do_nmi() */
/* %rax is saved above, so OK to clobber here */
movq %cr3, %rax
+ /* If PCID enabled, NOFLUSH now and NOFLUSH on return */
+ orq x86_cr3_pcid_noflush, %rax
pushq %rax
/* mask off "user" bit of pgd address and 12 PCID bits: */
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
- /* Add back kernel PCID and "no flush" bit */
- orq X86_CR3_PCID_KERN_VAR, %rax
movq %rax, %cr3
#endif
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -25,7 +25,7 @@
.macro _SWITCH_TO_KERNEL_CR3 reg
movq %cr3, \reg
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
-orq X86_CR3_PCID_KERN_VAR, \reg
+orq x86_cr3_pcid_noflush, \reg
movq \reg, %cr3
.endm
@@ -37,11 +37,10 @@ movq \reg, %cr3
* not enabled): so that the one register can update both memory and cr3.
*/
movq %cr3, \reg
-andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
-orq PER_CPU_VAR(X86_CR3_PCID_USER_VAR), \reg
+orq PER_CPU_VAR(x86_cr3_pcid_user), \reg
js 9f
/* FLUSH this time, reset to NOFLUSH for next time (if PCID enabled) */
-movb \regb, PER_CPU_VAR(X86_CR3_PCID_USER_VAR+7)
+movb \regb, PER_CPU_VAR(x86_cr3_pcid_user+7)
9:
movq \reg, %cr3
.endm
@@ -94,8 +93,8 @@ movq PER_CPU_VAR(unsafe_stack_register_b
*/
DECLARE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
-extern unsigned long X86_CR3_PCID_KERN_VAR;
-DECLARE_PER_CPU(unsigned long, X86_CR3_PCID_USER_VAR);
+extern unsigned long x86_cr3_pcid_noflush;
+DECLARE_PER_CPU(unsigned long, x86_cr3_pcid_user);
extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -28,8 +28,8 @@ DEFINE_PER_CPU_USER_MAPPED(unsigned long
* This is also handy because systems that do not support PCIDs
* just end up or'ing a 0 into their CR3, which does no harm.
*/
-__aligned(PAGE_SIZE) unsigned long X86_CR3_PCID_KERN_VAR;
-DEFINE_PER_CPU(unsigned long, X86_CR3_PCID_USER_VAR);
+unsigned long x86_cr3_pcid_noflush __read_mostly;
+DEFINE_PER_CPU(unsigned long, x86_cr3_pcid_user);
/*
* At runtime, the only things we map are some things for CPU
@@ -303,7 +303,8 @@ void __init kaiser_init(void)
sizeof(gate_desc) * NR_VECTORS,
__PAGE_KERNEL);
- kaiser_add_user_map_early(&X86_CR3_PCID_KERN_VAR, PAGE_SIZE,
+ kaiser_add_user_map_early(&x86_cr3_pcid_noflush,
+ sizeof(x86_cr3_pcid_noflush),
__PAGE_KERNEL);
}
@@ -381,8 +382,8 @@ void kaiser_setup_pcid(void)
* These variables are used by the entry/exit
* code to change PCID and pgd and TLB flushing.
*/
- X86_CR3_PCID_KERN_VAR = kern_cr3;
- this_cpu_write(X86_CR3_PCID_USER_VAR, user_cr3);
+ x86_cr3_pcid_noflush = kern_cr3;
+ this_cpu_write(x86_cr3_pcid_user, user_cr3);
}
/*
@@ -392,7 +393,7 @@ void kaiser_setup_pcid(void)
*/
void kaiser_flush_tlb_on_return_to_user(void)
{
- this_cpu_write(X86_CR3_PCID_USER_VAR,
+ this_cpu_write(x86_cr3_pcid_user,
X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
}
EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
Patches currently in stable-queue which might be from hughd(a)google.com are
queue-4.9/kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
queue-4.9/kaiser-add-nokaiser-boot-option-using-alternative.patch
queue-4.9/kaiser-fix-unlikely-error-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/kaiser-merged-update.patch
queue-4.9/kaiser-delete-kaiser_real_switch-option.patch
queue-4.9/kaiser-kaiser_remove_mapping-move-along-the-pgd.patch
queue-4.9/kaiser-fix-perf-crashes.patch
queue-4.9/kaiser-drop-is_atomic-arg-to-kaiser_pagetable_walk.patch
queue-4.9/kaiser-load_new_mm_cr3-let-switch_user_cr3-flush-user.patch
queue-4.9/kaiser-enhanced-by-kernel-and-user-pcids.patch
queue-4.9/kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
queue-4.9/kaiser-align-addition-to-x86-mm-makefile.patch
queue-4.9/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
queue-4.9/kaiser-stack-map-page_size-at-thread_size-page_size.patch
queue-4.9/kaiser-name-that-0x1000-kaiser_shadow_pgd_offset.patch
queue-4.9/kaiser-fix-regs-to-do_nmi-ifndef-config_kaiser.patch
queue-4.9/kaiser-do-not-set-_page_nx-on-pgd_none.patch
queue-4.9/kaiser-tidied-up-asm-kaiser.h-somewhat.patch
queue-4.9/kaiser-cleanups-while-trying-for-gold-link.patch
queue-4.9/kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
queue-4.9/kaiser-fix-build-and-fixme-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kernel-address-isolation.patch
queue-4.9/kaiser-enomem-if-kaiser_pagetable_walk-null.patch
queue-4.9/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
queue-4.9/kaiser-paranoid_entry-pass-cr3-need-to-paranoid_exit.patch
queue-4.9/kaiser-kaiser-depends-on-smp.patch
queue-4.9/kaiser-pcid-0-for-kernel-and-128-for-user.patch
This is a note to let you know that I've just added the patch titled
x86/kaiser: Check boottime cmdline params
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kaiser-check-boottime-cmdline-params.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 2 Jan 2018 14:19:48 +0100
Subject: x86/kaiser: Check boottime cmdline params
From: Borislav Petkov <bp(a)suse.de>
AMD (and possibly other vendors) are not affected by the leak
KAISER is protecting against.
Keep the "nopti" for traditional reasons and add pti=<on|off|auto>
like upstream.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/kernel-parameters.txt | 6 +++
arch/x86/mm/kaiser.c | 59 +++++++++++++++++++++++++-----------
2 files changed, 47 insertions(+), 18 deletions(-)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3327,6 +3327,12 @@ bytes respectively. Such letter suffixes
pt. [PARIDE]
See Documentation/blockdev/paride.txt.
+ pti= [X86_64]
+ Control KAISER user/kernel address space isolation:
+ on - enable
+ off - disable
+ auto - default setting
+
pty.legacy_count=
[KNL] Number of legacy pty's. Overwrites compiled-in
default number.
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -15,6 +15,7 @@
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
#include <asm/desc.h>
+#include <asm/cmdline.h>
int kaiser_enabled __read_mostly = 1;
EXPORT_SYMBOL(kaiser_enabled); /* for inlined TLB flush functions */
@@ -263,6 +264,43 @@ static void __init kaiser_init_all_pgds(
WARN_ON(__ret); \
} while (0)
+void __init kaiser_check_boottime_disable(void)
+{
+ bool enable = true;
+ char arg[5];
+ int ret;
+
+ ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
+ if (ret > 0) {
+ if (!strncmp(arg, "on", 2))
+ goto enable;
+
+ if (!strncmp(arg, "off", 3))
+ goto disable;
+
+ if (!strncmp(arg, "auto", 4))
+ goto skip;
+ }
+
+ if (cmdline_find_option_bool(boot_command_line, "nopti"))
+ goto disable;
+
+skip:
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ goto disable;
+
+enable:
+ if (enable)
+ setup_force_cpu_cap(X86_FEATURE_KAISER);
+
+ return;
+
+disable:
+ pr_info("Kernel/User page tables isolation: disabled\n");
+ kaiser_enabled = 0;
+ setup_clear_cpu_cap(X86_FEATURE_KAISER);
+}
+
/*
* If anything in here fails, we will likely die on one of the
* first kernel->user transitions and init will die. But, we
@@ -274,12 +312,10 @@ void __init kaiser_init(void)
{
int cpu;
- if (!kaiser_enabled) {
- setup_clear_cpu_cap(X86_FEATURE_KAISER);
- return;
- }
+ kaiser_check_boottime_disable();
- setup_force_cpu_cap(X86_FEATURE_KAISER);
+ if (!kaiser_enabled)
+ return;
kaiser_init_all_pgds();
@@ -423,16 +459,3 @@ void kaiser_flush_tlb_on_return_to_user(
X86_CR3_PCID_USER_FLUSH | KAISER_SHADOW_PGD_OFFSET);
}
EXPORT_SYMBOL(kaiser_flush_tlb_on_return_to_user);
-
-static int __init x86_nokaiser_setup(char *s)
-{
- /* nopti doesn't accept parameters */
- if (s)
- return -EINVAL;
-
- kaiser_enabled = 0;
- pr_info("Kernel/User page tables isolation: disabled\n");
-
- return 0;
-}
-early_param("nopti", x86_nokaiser_setup);
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/x86-kaiser-reenable-paravirt.patch
queue-4.9/x86-kaiser-rename-and-simplify-x86_feature_kaiser-handling.patch
queue-4.9/x86-kaiser-check-boottime-cmdline-params.patch
queue-4.9/x86-kaiser-move-feature-detection-up.patch
This is a note to let you know that I've just added the patch titled
kaiser: vmstat show NR_KAISERTABLE as nr_overhead
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Hugh Dickins <hughd(a)google.com>
Date: Sat, 9 Sep 2017 21:27:32 -0700
Subject: kaiser: vmstat show NR_KAISERTABLE as nr_overhead
From: Hugh Dickins <hughd(a)google.com>
The kaiser update made an interesting choice, never to free any shadow
page tables. Contention on global spinlock was worrying, particularly
with it held across page table scans when freeing. Something had to be
done: I was going to add refcounting; but simply never to free them is
an appealing choice, minimizing contention without complicating the code
(the more a page table is found already, the less the spinlock is used).
But leaking pages in this way is also a worry: can we get away with it?
At the very least, we need a count to show how bad it actually gets:
in principle, one might end up wasting about 1/256 of memory that way
(1/512 for when direct-mapped pages have to be user-mapped, plus 1/512
for when they are user-mapped from the vmalloc area on another occasion
(but we don't have vmalloc'ed stacks, so only large ldts are vmalloc'ed).
Add per-cpu stat NR_KAISERTABLE: including 256 at startup for the
shared pgd entries, and 1 for each intermediate page table added
thereafter for user-mapping - but leave out the 1 per mm, for its
shadow pgd, because that distracts from the monotonic increase.
Shown in /proc/vmstat as nr_overhead (0 if kaiser not enabled).
In practice, it doesn't look so bad so far: more like 1/12000 after
nine hours of gtests below; and movable pageblock segregation should
tend to cluster the kaiser tables into a subset of the address space
(if not, they will be bad for compaction too). But production may
tell a different story: keep an eye on this number, and bring back
lighter freeing if it gets out of control (maybe a shrinker).
["nr_overhead" should of course say "nr_kaisertable", if it needs
to stay; but for the moment we are being coy, preferring that when
Joe Blow notices a new line in his /proc/vmstat, he does not get
too curious about what this "kaiser" stuff might be.]
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/kaiser.c | 16 +++++++++++-----
include/linux/mmzone.h | 3 ++-
mm/vmstat.c | 1 +
3 files changed, 14 insertions(+), 6 deletions(-)
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -121,9 +121,11 @@ static pte_t *kaiser_pagetable_walk(unsi
if (!new_pmd_page)
return NULL;
spin_lock(&shadow_table_allocation_lock);
- if (pud_none(*pud))
+ if (pud_none(*pud)) {
set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
- else
+ __inc_zone_page_state(virt_to_page((void *)
+ new_pmd_page), NR_KAISERTABLE);
+ } else
free_page(new_pmd_page);
spin_unlock(&shadow_table_allocation_lock);
}
@@ -139,9 +141,11 @@ static pte_t *kaiser_pagetable_walk(unsi
if (!new_pte_page)
return NULL;
spin_lock(&shadow_table_allocation_lock);
- if (pmd_none(*pmd))
+ if (pmd_none(*pmd)) {
set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
- else
+ __inc_zone_page_state(virt_to_page((void *)
+ new_pte_page), NR_KAISERTABLE);
+ } else
free_page(new_pte_page);
spin_unlock(&shadow_table_allocation_lock);
}
@@ -205,11 +209,13 @@ static void __init kaiser_init_all_pgds(
pgd = native_get_shadow_pgd(pgd_offset_k((unsigned long )0));
for (i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++) {
pgd_t new_pgd;
- pud_t *pud = pud_alloc_one(&init_mm, PAGE_OFFSET + i * PGDIR_SIZE);
+ pud_t *pud = pud_alloc_one(&init_mm,
+ PAGE_OFFSET + i * PGDIR_SIZE);
if (!pud) {
WARN_ON(1);
break;
}
+ inc_zone_page_state(virt_to_page(pud), NR_KAISERTABLE);
new_pgd = __pgd(_KERNPG_TABLE |__pa(pud));
/*
* Make sure not to stomp on some other pgd entry.
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -124,8 +124,9 @@ enum zone_stat_item {
NR_SLAB_UNRECLAIMABLE,
NR_PAGETABLE, /* used for pagetables */
NR_KERNEL_STACK_KB, /* measured in KiB */
- /* Second 128 byte cacheline */
+ NR_KAISERTABLE,
NR_BOUNCE,
+ /* Second 128 byte cacheline */
#if IS_ENABLED(CONFIG_ZSMALLOC)
NR_ZSPAGES, /* allocated in zsmalloc */
#endif
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -932,6 +932,7 @@ const char * const vmstat_text[] = {
"nr_slab_unreclaimable",
"nr_page_table_pages",
"nr_kernel_stack",
+ "nr_overhead",
"nr_bounce",
#if IS_ENABLED(CONFIG_ZSMALLOC)
"nr_zspages",
Patches currently in stable-queue which might be from hughd(a)google.com are
queue-4.9/kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
queue-4.9/kaiser-add-nokaiser-boot-option-using-alternative.patch
queue-4.9/kaiser-fix-unlikely-error-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/kaiser-merged-update.patch
queue-4.9/kaiser-delete-kaiser_real_switch-option.patch
queue-4.9/kaiser-kaiser_remove_mapping-move-along-the-pgd.patch
queue-4.9/kaiser-fix-perf-crashes.patch
queue-4.9/kaiser-drop-is_atomic-arg-to-kaiser_pagetable_walk.patch
queue-4.9/kaiser-load_new_mm_cr3-let-switch_user_cr3-flush-user.patch
queue-4.9/kaiser-enhanced-by-kernel-and-user-pcids.patch
queue-4.9/kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
queue-4.9/kaiser-align-addition-to-x86-mm-makefile.patch
queue-4.9/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
queue-4.9/kaiser-stack-map-page_size-at-thread_size-page_size.patch
queue-4.9/kaiser-name-that-0x1000-kaiser_shadow_pgd_offset.patch
queue-4.9/kaiser-fix-regs-to-do_nmi-ifndef-config_kaiser.patch
queue-4.9/kaiser-do-not-set-_page_nx-on-pgd_none.patch
queue-4.9/kaiser-tidied-up-asm-kaiser.h-somewhat.patch
queue-4.9/kaiser-cleanups-while-trying-for-gold-link.patch
queue-4.9/kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
queue-4.9/kaiser-fix-build-and-fixme-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kernel-address-isolation.patch
queue-4.9/kaiser-enomem-if-kaiser_pagetable_walk-null.patch
queue-4.9/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
queue-4.9/kaiser-paranoid_entry-pass-cr3-need-to-paranoid_exit.patch
queue-4.9/kaiser-kaiser-depends-on-smp.patch
queue-4.9/kaiser-pcid-0-for-kernel-and-128-for-user.patch
This is a note to let you know that I've just added the patch titled
kaiser: tidied up kaiser_add/remove_mapping slightly
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Hugh Dickins <hughd(a)google.com>
Date: Sun, 3 Sep 2017 19:23:08 -0700
Subject: kaiser: tidied up kaiser_add/remove_mapping slightly
From: Hugh Dickins <hughd(a)google.com>
Yes, unmap_pud_range_nofree()'s declaration ought to be in a
header file really, but I'm not sure we want to use it anyway:
so for now just declare it inside kaiser_remove_mapping().
And there doesn't seem to be such a thing as unmap_p4d_range(),
even in a 5-level paging tree.
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/kaiser.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -285,8 +285,7 @@ void __init kaiser_init(void)
__PAGE_KERNEL);
}
-extern void unmap_pud_range_nofree(pgd_t *pgd, unsigned long start, unsigned long end);
-// add a mapping to the shadow-mapping, and synchronize the mappings
+/* Add a mapping to the shadow mapping, and synchronize the mappings */
int kaiser_add_mapping(unsigned long addr, unsigned long size, unsigned long flags)
{
return kaiser_add_user_map((const void *)addr, size, flags);
@@ -294,15 +293,13 @@ int kaiser_add_mapping(unsigned long add
void kaiser_remove_mapping(unsigned long start, unsigned long size)
{
+ extern void unmap_pud_range_nofree(pgd_t *pgd,
+ unsigned long start, unsigned long end);
unsigned long end = start + size;
unsigned long addr;
for (addr = start; addr < end; addr += PGDIR_SIZE) {
pgd_t *pgd = native_get_shadow_pgd(pgd_offset_k(addr));
- /*
- * unmap_p4d_range() handles > P4D_SIZE unmaps,
- * so no need to trim 'end'.
- */
unmap_pud_range_nofree(pgd, addr, end);
}
}
Patches currently in stable-queue which might be from hughd(a)google.com are
queue-4.9/kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
queue-4.9/kaiser-add-nokaiser-boot-option-using-alternative.patch
queue-4.9/kaiser-fix-unlikely-error-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/kaiser-merged-update.patch
queue-4.9/kaiser-delete-kaiser_real_switch-option.patch
queue-4.9/kaiser-kaiser_remove_mapping-move-along-the-pgd.patch
queue-4.9/kaiser-fix-perf-crashes.patch
queue-4.9/kaiser-drop-is_atomic-arg-to-kaiser_pagetable_walk.patch
queue-4.9/kaiser-load_new_mm_cr3-let-switch_user_cr3-flush-user.patch
queue-4.9/kaiser-enhanced-by-kernel-and-user-pcids.patch
queue-4.9/kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
queue-4.9/kaiser-align-addition-to-x86-mm-makefile.patch
queue-4.9/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
queue-4.9/kaiser-stack-map-page_size-at-thread_size-page_size.patch
queue-4.9/kaiser-name-that-0x1000-kaiser_shadow_pgd_offset.patch
queue-4.9/kaiser-fix-regs-to-do_nmi-ifndef-config_kaiser.patch
queue-4.9/kaiser-do-not-set-_page_nx-on-pgd_none.patch
queue-4.9/kaiser-tidied-up-asm-kaiser.h-somewhat.patch
queue-4.9/kaiser-cleanups-while-trying-for-gold-link.patch
queue-4.9/kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
queue-4.9/kaiser-fix-build-and-fixme-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kernel-address-isolation.patch
queue-4.9/kaiser-enomem-if-kaiser_pagetable_walk-null.patch
queue-4.9/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
queue-4.9/kaiser-paranoid_entry-pass-cr3-need-to-paranoid_exit.patch
queue-4.9/kaiser-kaiser-depends-on-smp.patch
queue-4.9/kaiser-pcid-0-for-kernel-and-128-for-user.patch
This is a note to let you know that I've just added the patch titled
kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Hugh Dickins <hughd(a)google.com>
Date: Tue, 3 Oct 2017 20:49:04 -0700
Subject: kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
From: Hugh Dickins <hughd(a)google.com>
Now that we're playing the ALTERNATIVE game, use that more efficient
method: instead of user-mapping an extra page, and reading an extra
cacheline each time for x86_cr3_pcid_noflush.
Neel has found that __stringify(bts $X86_CR3_PCID_NOFLUSH_BIT, %rax)
is a working substitute for the "bts $63, %rax" in these ALTERNATIVEs;
but the one line with $63 in looks clearer, so let's stick with that.
Worried about what happens with an ALTERNATIVE between the jump and
jump label in another ALTERNATIVE? I was, but have checked the
combinations in SWITCH_KERNEL_CR3_NO_STACK at entry_SYSCALL_64,
and it does a good job.
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Jiri Kosina <jkosina(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/entry_64.S | 7 ++++---
arch/x86/include/asm/kaiser.h | 6 +++---
arch/x86/mm/kaiser.c | 11 +----------
3 files changed, 8 insertions(+), 16 deletions(-)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1084,7 +1084,8 @@ ENTRY(paranoid_entry)
jz 2f
orl $2, %ebx
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
- orq x86_cr3_pcid_noflush, %rax
+ /* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */
+ ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
movq %rax, %cr3
2:
#endif
@@ -1344,7 +1345,7 @@ ENTRY(nmi)
/* %rax is saved above, so OK to clobber here */
ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER
/* If PCID enabled, NOFLUSH now and NOFLUSH on return */
- orq x86_cr3_pcid_noflush, %rax
+ ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
pushq %rax
/* mask off "user" bit of pgd address and 12 PCID bits: */
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
@@ -1588,7 +1589,7 @@ end_repeat_nmi:
/* %rax is saved above, so OK to clobber here */
ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER
/* If PCID enabled, NOFLUSH now and NOFLUSH on return */
- orq x86_cr3_pcid_noflush, %rax
+ ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
pushq %rax
/* mask off "user" bit of pgd address and 12 PCID bits: */
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -25,7 +25,8 @@
.macro _SWITCH_TO_KERNEL_CR3 reg
movq %cr3, \reg
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
-orq x86_cr3_pcid_noflush, \reg
+/* If PCID enabled, set X86_CR3_PCID_NOFLUSH_BIT */
+ALTERNATIVE "", "bts $63, \reg", X86_FEATURE_PCID
movq \reg, %cr3
.endm
@@ -39,7 +40,7 @@ movq \reg, %cr3
movq %cr3, \reg
orq PER_CPU_VAR(x86_cr3_pcid_user), \reg
js 9f
-/* FLUSH this time, reset to NOFLUSH for next time (if PCID enabled) */
+/* If PCID enabled, FLUSH this time, reset to NOFLUSH for next time */
movb \regb, PER_CPU_VAR(x86_cr3_pcid_user+7)
9:
movq \reg, %cr3
@@ -90,7 +91,6 @@ movq PER_CPU_VAR(unsafe_stack_register_b
*/
DECLARE_PER_CPU_USER_MAPPED(unsigned long, unsafe_stack_register_backup);
-extern unsigned long x86_cr3_pcid_noflush;
DECLARE_PER_CPU(unsigned long, x86_cr3_pcid_user);
extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[];
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -31,7 +31,6 @@ DEFINE_PER_CPU_USER_MAPPED(unsigned long
* This is also handy because systems that do not support PCIDs
* just end up or'ing a 0 into their CR3, which does no harm.
*/
-unsigned long x86_cr3_pcid_noflush __read_mostly;
DEFINE_PER_CPU(unsigned long, x86_cr3_pcid_user);
/*
@@ -356,10 +355,6 @@ void __init kaiser_init(void)
kaiser_add_user_map_early(&debug_idt_table,
sizeof(gate_desc) * NR_VECTORS,
__PAGE_KERNEL);
-
- kaiser_add_user_map_early(&x86_cr3_pcid_noflush,
- sizeof(x86_cr3_pcid_noflush),
- __PAGE_KERNEL);
}
/* Add a mapping to the shadow mapping, and synchronize the mappings */
@@ -433,18 +428,14 @@ pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp,
void kaiser_setup_pcid(void)
{
- unsigned long kern_cr3 = 0;
unsigned long user_cr3 = KAISER_SHADOW_PGD_OFFSET;
- if (this_cpu_has(X86_FEATURE_PCID)) {
- kern_cr3 |= X86_CR3_PCID_KERN_NOFLUSH;
+ if (this_cpu_has(X86_FEATURE_PCID))
user_cr3 |= X86_CR3_PCID_USER_NOFLUSH;
- }
/*
* These variables are used by the entry/exit
* code to change PCID and pgd and TLB flushing.
*/
- x86_cr3_pcid_noflush = kern_cr3;
this_cpu_write(x86_cr3_pcid_user, user_cr3);
}
Patches currently in stable-queue which might be from hughd(a)google.com are
queue-4.9/kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
queue-4.9/kaiser-add-nokaiser-boot-option-using-alternative.patch
queue-4.9/kaiser-fix-unlikely-error-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/kaiser-merged-update.patch
queue-4.9/kaiser-delete-kaiser_real_switch-option.patch
queue-4.9/kaiser-kaiser_remove_mapping-move-along-the-pgd.patch
queue-4.9/kaiser-fix-perf-crashes.patch
queue-4.9/kaiser-drop-is_atomic-arg-to-kaiser_pagetable_walk.patch
queue-4.9/kaiser-load_new_mm_cr3-let-switch_user_cr3-flush-user.patch
queue-4.9/kaiser-enhanced-by-kernel-and-user-pcids.patch
queue-4.9/kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
queue-4.9/kaiser-align-addition-to-x86-mm-makefile.patch
queue-4.9/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
queue-4.9/kaiser-stack-map-page_size-at-thread_size-page_size.patch
queue-4.9/kaiser-name-that-0x1000-kaiser_shadow_pgd_offset.patch
queue-4.9/kaiser-fix-regs-to-do_nmi-ifndef-config_kaiser.patch
queue-4.9/kaiser-do-not-set-_page_nx-on-pgd_none.patch
queue-4.9/kaiser-tidied-up-asm-kaiser.h-somewhat.patch
queue-4.9/kaiser-cleanups-while-trying-for-gold-link.patch
queue-4.9/kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
queue-4.9/kaiser-fix-build-and-fixme-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kernel-address-isolation.patch
queue-4.9/kaiser-enomem-if-kaiser_pagetable_walk-null.patch
queue-4.9/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
queue-4.9/kaiser-paranoid_entry-pass-cr3-need-to-paranoid_exit.patch
queue-4.9/kaiser-kaiser-depends-on-smp.patch
queue-4.9/kaiser-pcid-0-for-kernel-and-128-for-user.patch
This is a note to let you know that I've just added the patch titled
kaiser: PCID 0 for kernel and 128 for user
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kaiser-pcid-0-for-kernel-and-128-for-user.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Wed Jan 3 20:37:21 CET 2018
From: Hugh Dickins <hughd(a)google.com>
Date: Fri, 8 Sep 2017 19:26:30 -0700
Subject: kaiser: PCID 0 for kernel and 128 for user
From: Hugh Dickins <hughd(a)google.com>
Why was 4 chosen for kernel PCID and 6 for user PCID?
No good reason in a backport where PCIDs are only used for Kaiser.
If we continue with those, then we shall need to add Andy Lutomirski's
4.13 commit 6c690ee1039b ("x86/mm: Split read_cr3() into read_cr3_pa()
and __read_cr3()"), which deals with the problem of read_cr3() callers
finding stray bits in the cr3 that they expected to be page-aligned;
and for hibernation, his 4.14 commit f34902c5c6c0 ("x86/hibernate/64:
Mask off CR3's PCID bits in the saved CR3").
But if 0 is used for kernel PCID, then there's no need to add in those
commits - whenever the kernel looks, it sees 0 in the lower bits; and
0 for kernel seems an obvious choice.
And I naughtily propose 128 for user PCID. Because there's a place
in _SWITCH_TO_USER_CR3 where it takes note of the need for TLB FLUSH,
but needs to reset that to NOFLUSH for the next occasion. Currently
it does so with a "movb $(0x80)" into the high byte of the per-cpu
quadword, but that will cause a machine without PCID support to crash.
Now, if %al just happened to have 0x80 in it at that point, on a
machine with PCID support, but 0 on a machine without PCID support...
(That will go badly wrong once the pgd can be at a physical address
above 2^56, but even with 5-level paging, physical goes up to 2^52.)
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/kaiser.h | 19 ++++++++++++-------
arch/x86/include/asm/pgtable_types.h | 7 ++++---
arch/x86/mm/tlb.c | 3 +++
3 files changed, 19 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -29,14 +29,19 @@ orq X86_CR3_PCID_KERN_VAR, \reg
movq \reg, %cr3
.endm
-.macro _SWITCH_TO_USER_CR3 reg
+.macro _SWITCH_TO_USER_CR3 reg regb
+/*
+ * regb must be the low byte portion of reg: because we have arranged
+ * for the low byte of the user PCID to serve as the high byte of NOFLUSH
+ * (0x80 for each when PCID is enabled, or 0x00 when PCID and NOFLUSH are
+ * not enabled): so that the one register can update both memory and cr3.
+ */
movq %cr3, \reg
andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), \reg
orq PER_CPU_VAR(X86_CR3_PCID_USER_VAR), \reg
js 9f
-// FLUSH this time, reset to NOFLUSH for next time
-// But if nopcid? Consider using 0x80 for user pcid?
-movb $(0x80), PER_CPU_VAR(X86_CR3_PCID_USER_VAR+7)
+/* FLUSH this time, reset to NOFLUSH for next time (if PCID enabled) */
+movb \regb, PER_CPU_VAR(X86_CR3_PCID_USER_VAR+7)
9:
movq \reg, %cr3
.endm
@@ -49,7 +54,7 @@ popq %rax
.macro SWITCH_USER_CR3
pushq %rax
-_SWITCH_TO_USER_CR3 %rax
+_SWITCH_TO_USER_CR3 %rax %al
popq %rax
.endm
@@ -61,7 +66,7 @@ movq PER_CPU_VAR(unsafe_stack_register_b
.macro SWITCH_USER_CR3_NO_STACK
movq %rax, PER_CPU_VAR(unsafe_stack_register_backup)
-_SWITCH_TO_USER_CR3 %rax
+_SWITCH_TO_USER_CR3 %rax %al
movq PER_CPU_VAR(unsafe_stack_register_backup), %rax
.endm
@@ -69,7 +74,7 @@ movq PER_CPU_VAR(unsafe_stack_register_b
.macro SWITCH_KERNEL_CR3 reg
.endm
-.macro SWITCH_USER_CR3 reg
+.macro SWITCH_USER_CR3 reg regb
.endm
.macro SWITCH_USER_CR3_NO_STACK
.endm
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -146,16 +146,17 @@
/* Mask for all the PCID-related bits in CR3: */
#define X86_CR3_PCID_MASK (X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_MASK)
+#define X86_CR3_PCID_ASID_KERN (_AC(0x0,UL))
+
#if defined(CONFIG_KAISER) && defined(CONFIG_X86_64)
-#define X86_CR3_PCID_ASID_KERN (_AC(0x4,UL))
-#define X86_CR3_PCID_ASID_USER (_AC(0x6,UL))
+/* Let X86_CR3_PCID_ASID_USER be usable for the X86_CR3_PCID_NOFLUSH bit */
+#define X86_CR3_PCID_ASID_USER (_AC(0x80,UL))
#define X86_CR3_PCID_KERN_FLUSH (X86_CR3_PCID_ASID_KERN)
#define X86_CR3_PCID_USER_FLUSH (X86_CR3_PCID_ASID_USER)
#define X86_CR3_PCID_KERN_NOFLUSH (X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_KERN)
#define X86_CR3_PCID_USER_NOFLUSH (X86_CR3_PCID_NOFLUSH | X86_CR3_PCID_ASID_USER)
#else
-#define X86_CR3_PCID_ASID_KERN (_AC(0x0,UL))
#define X86_CR3_PCID_ASID_USER (_AC(0x0,UL))
/*
* PCIDs are unsupported on 32-bit and none of these bits can be
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -50,6 +50,9 @@ static void load_new_mm_cr3(pgd_t *pgdir
* invpcid_flush_single_context(X86_CR3_PCID_ASID_USER) could
* do it here, but can only be used if X86_FEATURE_INVPCID is
* available - and many machines support pcid without invpcid.
+ *
+ * The line below is a no-op: X86_CR3_PCID_KERN_FLUSH is now 0;
+ * but keep that line in there in case something changes.
*/
new_mm_cr3 |= X86_CR3_PCID_KERN_FLUSH;
kaiser_flush_tlb_on_return_to_user();
Patches currently in stable-queue which might be from hughd(a)google.com are
queue-4.9/kaiser-vmstat-show-nr_kaisertable-as-nr_overhead.patch
queue-4.9/kaiser-add-nokaiser-boot-option-using-alternative.patch
queue-4.9/kaiser-fix-unlikely-error-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kaiser_flush_tlb_on_return_to_user-check-pcid.patch
queue-4.9/x86-paravirt-dont-patch-flush_tlb_single.patch
queue-4.9/kaiser-merged-update.patch
queue-4.9/kaiser-delete-kaiser_real_switch-option.patch
queue-4.9/kaiser-kaiser_remove_mapping-move-along-the-pgd.patch
queue-4.9/kaiser-fix-perf-crashes.patch
queue-4.9/kaiser-drop-is_atomic-arg-to-kaiser_pagetable_walk.patch
queue-4.9/kaiser-load_new_mm_cr3-let-switch_user_cr3-flush-user.patch
queue-4.9/kaiser-enhanced-by-kernel-and-user-pcids.patch
queue-4.9/kaiser-x86_cr3_pcid_noflush-and-x86_cr3_pcid_user.patch
queue-4.9/kaiser-align-addition-to-x86-mm-makefile.patch
queue-4.9/kaiser-use-alternative-instead-of-x86_cr3_pcid_noflush.patch
queue-4.9/kaiser-stack-map-page_size-at-thread_size-page_size.patch
queue-4.9/kaiser-name-that-0x1000-kaiser_shadow_pgd_offset.patch
queue-4.9/kaiser-fix-regs-to-do_nmi-ifndef-config_kaiser.patch
queue-4.9/kaiser-do-not-set-_page_nx-on-pgd_none.patch
queue-4.9/kaiser-tidied-up-asm-kaiser.h-somewhat.patch
queue-4.9/kaiser-cleanups-while-trying-for-gold-link.patch
queue-4.9/kaiser-tidied-up-kaiser_add-remove_mapping-slightly.patch
queue-4.9/kaiser-fix-build-and-fixme-in-alloc_ldt_struct.patch
queue-4.9/kaiser-kernel-address-isolation.patch
queue-4.9/kaiser-enomem-if-kaiser_pagetable_walk-null.patch
queue-4.9/kaiser-asm-tlbflush.h-handle-nopge-at-lower-level.patch
queue-4.9/kaiser-paranoid_entry-pass-cr3-need-to-paranoid_exit.patch
queue-4.9/kaiser-kaiser-depends-on-smp.patch
queue-4.9/kaiser-pcid-0-for-kernel-and-128-for-user.patch