Hi Greg,
This patch series backports the Spectre v2 fixes (IBPB/IBRS) and the fixes for the Speculative Store Bypass vulnerability to 4.4.y; the patches apply cleanly on top of 4.4.140.

I used 4.9.y as my reference when backporting to 4.4.y, since I expected that to minimize the amount of fixing up needed. Unfortunately I had to skip the KVM fixes for these vulnerabilities, because the KVM codebase in 4.4 is drastically different from the one in 4.9. (I did attempt to backport them initially, but wasn't confident that the result was correct, so I dropped them from this series.)

You'll notice that the first few patches in this series are cleanups etc. that are not strictly required for IBPB/IBRS/SSBD. Most of them are aimed at bringing the cpufeature.h vs. cpufeatures.h split into 4.4, since many of the subsequent patches update those headers. On my first attempt at this backport I tried to make all the updates in cpufeature.h itself, but that quickly became cumbersome, so I backported the cpufeature.h vs. cpufeatures.h split and its dependencies as well. Apart from these initial patches, I don't think the rest of the series carries much noise.
This patchset has been tested on both Intel and AMD machines (an Intel Xeon E5-2660 v4 and an AMD EPYC 7281 16-core processor, respectively) with updated microcode. All of the patch backports have been independently reviewed by Matt Helsley, Alexey Makhalov and Bo Gan.

I would appreciate it if you could consider these patches for review and inclusion in a future 4.4.y release.
Thank you very much!
Regards,
Srivatsa
VMware Photon OS
P.S. This patchset is also available in the following repo if anyone is interested in giving it a try:
https://github.com/srivatsabhat/linux-stable (branch: spectre-v2-fixes-nokvm-4.4.140)
From: Borislav Petkov <bp@suse.de>
commit 2ccd71f1b278d450a6f8c8c737c7fe237ca06dc6 upstream
Turn the CPUID leafs which are proper CPUID feature bit leafs into separate ->x86_capability words.
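(To illustrate the effect, with a snippet that is not itself part of the patch: instead of probing leaf 0x6 bit-by-bit through the scattered.c table, the whole register is captured once and each feature becomes a plain indexed bit test.)

	/* word 14: Thermal/Power Management leaf, CPUID 0x00000006 (EAX) */
	c->x86_capability[14] = cpuid_eax(0x00000006);

	/* X86_FEATURE_ARAT is now defined as (14*32 + 2), i.e. word 14, bit 2: */
	if (cpu_has(c, X86_FEATURE_ARAT))
		pr_info("Always Running APIC Timer present\n");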
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-2-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/include/asm/cpufeature.h | 54 ++++++++++++++++++++++---------------
 arch/x86/kernel/cpu/common.c      |  5 +++
 arch/x86/kernel/cpu/scattered.c   | 20 --------------
 3 files changed, 37 insertions(+), 42 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 232621c..999e166 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -12,7 +12,7 @@
 #include <asm/disabled-features.h>
 #endif
 
-#define NCAPINTS	14	/* N 32-bit words worth of info */
+#define NCAPINTS	16	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -181,23 +181,18 @@
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
- * CPUID levels like 0x6, 0xA etc, word 7
+ * CPUID levels like 0x6, 0xA etc, word 7.
+ *
+ * Reuse free bits when adding new feature flags!
  */
-#define X86_FEATURE_IDA		( 7*32+ 0) /* Intel Dynamic Acceleration */
-#define X86_FEATURE_ARAT	( 7*32+ 1) /* Always Running APIC Timer */
+
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
 #define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 4) /* Effectively INVPCID && CR4.PCIDE=1 */
-#define X86_FEATURE_PLN		( 7*32+ 5) /* Intel Power Limit Notification */
-#define X86_FEATURE_PTS		( 7*32+ 6) /* Intel Package Thermal Status */
-#define X86_FEATURE_DTHERM	( 7*32+ 7) /* Digital Thermal Sensor */
+
 #define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */
 #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
-#define X86_FEATURE_HWP		( 7*32+10) /* "hwp" Intel HWP */
-#define X86_FEATURE_HWP_NOTIFY	( 7*32+11) /* Intel HWP_NOTIFY */
-#define X86_FEATURE_HWP_ACT_WINDOW ( 7*32+12) /* Intel HWP_ACT_WINDOW */
-#define X86_FEATURE_HWP_EPP	( 7*32+13) /* Intel HWP_EPP */
-#define X86_FEATURE_HWP_PKG_REQ	( 7*32+14) /* Intel HWP_PKG_REQ */
+
 #define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */
 #define X86_FEATURE_RSB_CTXSW	( 7*32+19) /* Fill RSB on context switches */
 
@@ -212,16 +207,7 @@
 #define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
 #define X86_FEATURE_EPT		( 8*32+ 3) /* Intel Extended Page Table */
 #define X86_FEATURE_VPID	( 8*32+ 4) /* Intel Virtual Processor ID */
-#define X86_FEATURE_NPT		( 8*32+ 5) /* AMD Nested Page Table support */
-#define X86_FEATURE_LBRV	( 8*32+ 6) /* AMD LBR Virtualization support */
-#define X86_FEATURE_SVML	( 8*32+ 7) /* "svm_lock" AMD SVM locking MSR */
-#define X86_FEATURE_NRIPS	( 8*32+ 8) /* "nrip_save" AMD SVM next_rip save */
-#define X86_FEATURE_TSCRATEMSR	( 8*32+ 9) /* "tsc_scale" AMD TSC scaling support */
-#define X86_FEATURE_VMCBCLEAN	( 8*32+10) /* "vmcb_clean" AMD VMCB clean bits support */
-#define X86_FEATURE_FLUSHBYASID	( 8*32+11) /* AMD flush-by-ASID support */
-#define X86_FEATURE_DECODEASSISTS ( 8*32+12) /* AMD Decode Assists support */
-#define X86_FEATURE_PAUSEFILTER	( 8*32+13) /* AMD filtered pause intercept */
-#define X86_FEATURE_PFTHRESHOLD	( 8*32+14) /* AMD pause filter threshold */
+
 #define X86_FEATURE_VMMCALL	( 8*32+15) /* Prefer vmmcall to vmcall */
 #define X86_FEATURE_XENPV	( 8*32+16) /* "" Xen paravirtual guest */
 
@@ -266,6 +252,30 @@
 /* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
 #define X86_FEATURE_CLZERO	(13*32+ 0) /* CLZERO instruction */
 
+/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
+#define X86_FEATURE_DTHERM	(14*32+ 0) /* Digital Thermal Sensor */
+#define X86_FEATURE_IDA		(14*32+ 1) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT	(14*32+ 2) /* Always Running APIC Timer */
+#define X86_FEATURE_PLN		(14*32+ 4) /* Intel Power Limit Notification */
+#define X86_FEATURE_PTS		(14*32+ 6) /* Intel Package Thermal Status */
+#define X86_FEATURE_HWP		(14*32+ 7) /* Intel Hardware P-states */
+#define X86_FEATURE_HWP_NOTIFY	(14*32+ 8) /* HWP Notification */
+#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
+#define X86_FEATURE_HWP_EPP	(14*32+10) /* HWP Energy Perf. Preference */
+#define X86_FEATURE_HWP_PKG_REQ	(14*32+11) /* HWP Package Level Request */
+
+/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
+#define X86_FEATURE_NPT		(15*32+ 0) /* Nested Page Table support */
+#define X86_FEATURE_LBRV	(15*32+ 1) /* LBR Virtualization support */
+#define X86_FEATURE_SVML	(15*32+ 2) /* "svm_lock" SVM locking MSR */
+#define X86_FEATURE_NRIPS	(15*32+ 3) /* "nrip_save" SVM next_rip save */
+#define X86_FEATURE_TSCRATEMSR	(15*32+ 4) /* "tsc_scale" TSC scaling support */
+#define X86_FEATURE_VMCBCLEAN	(15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
+#define X86_FEATURE_FLUSHBYASID	(15*32+ 6) /* flush-by-ASID support */
+#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
+#define X86_FEATURE_PAUSEFILTER	(15*32+10) /* filtered pause intercept */
+#define X86_FEATURE_PFTHRESHOLD	(15*32+12) /* pause filter threshold */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0498ad3..2007dfc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -695,6 +695,8 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
 
 		c->x86_capability[9] = ebx;
+
+		c->x86_capability[14] = cpuid_eax(0x00000006);
 	}
 
 	/* Extended state features: level 0x0000000d */
@@ -756,6 +758,9 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	if (c->extended_cpuid_level >= 0x80000007)
 		c->x86_power = cpuid_edx(0x80000007);
 
+	if (c->extended_cpuid_level >= 0x8000000a)
+		c->x86_capability[15] = cpuid_edx(0x8000000a);
+
 	init_scattered_cpuid_features(c);
 }
 
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 608fb26..8cb57df 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,32 +31,12 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 	const struct cpuid_bit *cb;
 
 	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_DTHERM,		CR_EAX, 0, 0x00000006, 0 },
-		{ X86_FEATURE_IDA,		CR_EAX, 1, 0x00000006, 0 },
-		{ X86_FEATURE_ARAT,		CR_EAX, 2, 0x00000006, 0 },
-		{ X86_FEATURE_PLN,		CR_EAX, 4, 0x00000006, 0 },
-		{ X86_FEATURE_PTS,		CR_EAX, 6, 0x00000006, 0 },
-		{ X86_FEATURE_HWP,		CR_EAX, 7, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_NOTIFY,	CR_EAX, 8, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_ACT_WINDOW,	CR_EAX, 9, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_EPP,		CR_EAX,10, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_PKG_REQ,	CR_EAX,11, 0x00000006, 0 },
 		{ X86_FEATURE_INTEL_PT,		CR_EBX,25, 0x00000007, 0 },
 		{ X86_FEATURE_APERFMPERF,	CR_ECX, 0, 0x00000006, 0 },
 		{ X86_FEATURE_EPB,		CR_ECX, 3, 0x00000006, 0 },
 		{ X86_FEATURE_HW_PSTATE,	CR_EDX, 7, 0x80000007, 0 },
 		{ X86_FEATURE_CPB,		CR_EDX, 9, 0x80000007, 0 },
 		{ X86_FEATURE_PROC_FEEDBACK,	CR_EDX,11, 0x80000007, 0 },
-		{ X86_FEATURE_NPT,		CR_EDX, 0, 0x8000000a, 0 },
-		{ X86_FEATURE_LBRV,		CR_EDX, 1, 0x8000000a, 0 },
-		{ X86_FEATURE_SVML,		CR_EDX, 2, 0x8000000a, 0 },
-		{ X86_FEATURE_NRIPS,		CR_EDX, 3, 0x8000000a, 0 },
-		{ X86_FEATURE_TSCRATEMSR,	CR_EDX, 4, 0x8000000a, 0 },
-		{ X86_FEATURE_VMCBCLEAN,	CR_EDX, 5, 0x8000000a, 0 },
-		{ X86_FEATURE_FLUSHBYASID,	CR_EDX, 6, 0x8000000a, 0 },
-		{ X86_FEATURE_DECODEASSISTS,	CR_EDX, 7, 0x8000000a, 0 },
-		{ X86_FEATURE_PAUSEFILTER,	CR_EDX,10, 0x8000000a, 0 },
-		{ X86_FEATURE_PFTHRESHOLD,	CR_EDX,12, 0x8000000a, 0 },
 		{ 0, 0, 0, 0, 0 }
 	};
From: Borislav Petkov <bp@suse.de>
commit 39c06df4dc10a41de5fe706f4378ee5f09beba73 upstream
Add an enum for the ->x86_capability array indices and cleanup get_cpu_cap() by killing some redundant local vars.
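The main payoff is that the array indices become self-documenting; for example (taken from the diff below):

	/* before: which CPUID leaf does word 9 hold, again? */
	c->x86_capability[9] = ebx;

	/* after: the index names leaf 0x7, subleaf 0, register EBX */
	c->x86_capability[CPUID_7_0_EBX] = ebx;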
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-3-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/include/asm/cpufeature.h | 20 ++++++++++++++++
 arch/x86/kernel/cpu/centaur.c     |  2 +-
 arch/x86/kernel/cpu/common.c      | 47 +++++++++++++++++--------------------
 arch/x86/kernel/cpu/transmeta.c   |  4 ++-
 4 files changed, 45 insertions(+), 28 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 999e166..54c7a0b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -299,6 +299,26 @@
 #include <asm/asm.h>
 #include <linux/bitops.h>
 
+enum cpuid_leafs
+{
+	CPUID_1_EDX		= 0,
+	CPUID_8000_0001_EDX,
+	CPUID_8086_0001_EDX,
+	CPUID_LNX_1,
+	CPUID_1_ECX,
+	CPUID_C000_0001_EDX,
+	CPUID_8000_0001_ECX,
+	CPUID_LNX_2,
+	CPUID_LNX_3,
+	CPUID_7_0_EBX,
+	CPUID_D_1_EAX,
+	CPUID_F_0_EDX,
+	CPUID_F_1_EDX,
+	CPUID_8000_0008_EBX,
+	CPUID_6_EAX,
+	CPUID_8000_000A_EDX,
+};
+
 #ifdef CONFIG_X86_FEATURE_NAMES
 extern const char * const x86_cap_flags[NCAPINTS*32];
 extern const char * const x86_power_flags[32];
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index d8fba5c..ae20be6 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -43,7 +43,7 @@ static void init_c3(struct cpuinfo_x86 *c)
 		/* store Centaur Extended Feature Flags as
 		 * word 5 of the CPU capability bit array
 		 */
-		c->x86_capability[5] = cpuid_edx(0xC0000001);
+		c->x86_capability[CPUID_C000_0001_EDX] = cpuid_edx(0xC0000001);
 	}
 #ifdef CONFIG_X86_32
 	/* Cyrix III family needs CX8 & PGE explicitly enabled. */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2007dfc..5b6e43b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -676,52 +676,47 @@ static void apply_forced_caps(struct cpuinfo_x86 *c)
 
 void get_cpu_cap(struct cpuinfo_x86 *c)
 {
-	u32 tfms, xlvl;
-	u32 ebx;
+	u32 eax, ebx, ecx, edx;
 
 	/* Intel-defined flags: level 0x00000001 */
 	if (c->cpuid_level >= 0x00000001) {
-		u32 capability, excap;
+		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
 
-		cpuid(0x00000001, &tfms, &ebx, &excap, &capability);
-		c->x86_capability[0] = capability;
-		c->x86_capability[4] = excap;
+		c->x86_capability[CPUID_1_ECX] = ecx;
+		c->x86_capability[CPUID_1_EDX] = edx;
 	}
 
 	/* Additional Intel-defined flags: level 0x00000007 */
 	if (c->cpuid_level >= 0x00000007) {
-		u32 eax, ebx, ecx, edx;
-
 		cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
 
-		c->x86_capability[9] = ebx;
+		c->x86_capability[CPUID_7_0_EBX] = ebx;
 
-		c->x86_capability[14] = cpuid_eax(0x00000006);
+		c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006);
 	}
 
 	/* Extended state features: level 0x0000000d */
 	if (c->cpuid_level >= 0x0000000d) {
-		u32 eax, ebx, ecx, edx;
-
 		cpuid_count(0x0000000d, 1, &eax, &ebx, &ecx, &edx);
 
-		c->x86_capability[10] = eax;
+		c->x86_capability[CPUID_D_1_EAX] = eax;
 	}
 
 	/* Additional Intel-defined flags: level 0x0000000F */
 	if (c->cpuid_level >= 0x0000000F) {
-		u32 eax, ebx, ecx, edx;
 
 		/* QoS sub-leaf, EAX=0Fh, ECX=0 */
 		cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
-		c->x86_capability[11] = edx;
+		c->x86_capability[CPUID_F_0_EDX] = edx;
+
 		if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
 			/* will be overridden if occupancy monitoring exists */
 			c->x86_cache_max_rmid = ebx;
 
 			/* QoS sub-leaf, EAX=0Fh, ECX=1 */
 			cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
-			c->x86_capability[12] = edx;
+			c->x86_capability[CPUID_F_1_EDX] = edx;
+
 			if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) {
 				c->x86_cache_max_rmid = ecx;
 				c->x86_cache_occ_scale = ebx;
@@ -733,22 +728,24 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	}
 
 	/* AMD-defined flags: level 0x80000001 */
-	xlvl = cpuid_eax(0x80000000);
-	c->extended_cpuid_level = xlvl;
+	eax = cpuid_eax(0x80000000);
+	c->extended_cpuid_level = eax;
+
+	if ((eax & 0xffff0000) == 0x80000000) {
+		if (eax >= 0x80000001) {
+			cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
 
-	if ((xlvl & 0xffff0000) == 0x80000000) {
-		if (xlvl >= 0x80000001) {
-			c->x86_capability[1] = cpuid_edx(0x80000001);
-			c->x86_capability[6] = cpuid_ecx(0x80000001);
+			c->x86_capability[CPUID_8000_0001_ECX] = ecx;
+			c->x86_capability[CPUID_8000_0001_EDX] = edx;
 		}
 	}
 
 	if (c->extended_cpuid_level >= 0x80000008) {
-		u32 eax = cpuid_eax(0x80000008);
+		cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
-		c->x86_capability[13] = cpuid_ebx(0x80000008);
+		c->x86_capability[CPUID_8000_0008_EBX] = ebx;
 	}
 #ifdef CONFIG_X86_32
 	else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36))
@@ -759,7 +756,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		c->x86_power = cpuid_edx(0x80000007);
 
 	if (c->extended_cpuid_level >= 0x8000000a)
-		c->x86_capability[15] = cpuid_edx(0x8000000a);
+		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
 
 	init_scattered_cpuid_features(c);
 }
diff --git a/arch/x86/kernel/cpu/transmeta.c b/arch/x86/kernel/cpu/transmeta.c
index 3fa0e5a..252da7a 100644
--- a/arch/x86/kernel/cpu/transmeta.c
+++ b/arch/x86/kernel/cpu/transmeta.c
@@ -12,7 +12,7 @@ static void early_init_transmeta(struct cpuinfo_x86 *c)
 	xlvl = cpuid_eax(0x80860000);
 	if ((xlvl & 0xffff0000) == 0x80860000) {
 		if (xlvl >= 0x80860001)
-			c->x86_capability[2] = cpuid_edx(0x80860001);
+			c->x86_capability[CPUID_8086_0001_EDX] = cpuid_edx(0x80860001);
 	}
 }
 
@@ -82,7 +82,7 @@ static void init_transmeta(struct cpuinfo_x86 *c)
 	/* Unhide possibly hidden capability flags */
 	rdmsr(0x80860004, cap_mask, uk);
 	wrmsr(0x80860004, ~0, uk);
-	c->x86_capability[0] = cpuid_edx(0x00000001);
+	c->x86_capability[CPUID_1_EDX] = cpuid_edx(0x00000001);
 	wrmsr(0x80860004, cap_mask, uk);
 
 	/* All Transmeta CPUs have a constant TSC */
From: Borislav Petkov <bp@suse.de>
commit 6e1315fe82308cd29e7550eab967262e8bbc71a3 upstream
This brings .text savings of about ~1.6K when building a tinyconfig. It is off by default so nothing changes for the default.
Kconfig help text from Josh.
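For reference, when the option is off, static_cpu_has() and friends simply fall back to the existing dynamic test (roughly, from the #else branch in cpufeature.h):

	#define static_cpu_has(bit)		boot_cpu_has(bit)
	#define static_cpu_has_safe(bit)	boot_cpu_has(bit)

so the only cost of saying N is a runtime bitmap lookup instead of patched-in code.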
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Link: http://lkml.kernel.org/r/1449481182-27541-5-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/Kconfig                  | 11 +++++++++++
 arch/x86/include/asm/cpufeature.h |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index eab1ef2..d9afe6d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -346,6 +346,17 @@ config X86_FEATURE_NAMES
 
 	  If in doubt, say Y.
 
+config X86_FAST_FEATURE_TESTS
+	bool "Fast CPU feature tests" if EMBEDDED
+	default y
+	---help---
+	  Some fast-paths in the kernel depend on the capabilities of the CPU.
+	  Say Y here for the kernel to patch in the appropriate code at runtime
+	  based on the capabilities of the CPU. The infrastructure for patching
+	  code at runtime takes up some additional space; space-constrained
+	  embedded systems may wish to say N here to produce smaller, slightly
+	  slower code.
+
 config X86_X2APIC
 	bool "Support x2apic"
 	depends on X86_LOCAL_APIC && X86_64 && (IRQ_REMAP || HYPERVISOR_GUEST)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 54c7a0b..67f917b 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -422,7 +422,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  * fast paths and boot_cpu_has() otherwise!
  */
 
-#if __GNUC__ >= 4
+#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS)
 extern void warn_pre_alternatives(void);
 extern bool __static_cpu_has_safe(u16 bit);
From: Borislav Petkov <bp@suse.de>
commit b74a0cf1b3db30173eefa00c411775d2b1697700 upstream
Add an XSTATE_OP() macro which contains the XSAVE* fault handling and replace all non-alternatives users of xstate_fault() with it.
This fixes also the buglet in copy_xregs_to_user() and copy_user_to_xregs() where the inline asm didn't have @xstate as memory reference and thus potentially causing unwanted reordering of accesses to the extended state.
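For background (an illustration, not code from the patch): an asm statement that receives only the pointer gives the compiler no data dependency on the bytes behind it, whereas naming the buffer itself as an operand pins accesses to it to the correct side of the asm:

	asm volatile("..." : : "D" (st));		/* compiler sees just the pointer value */
	asm volatile("..." : : "D" (st), "m" (*st));	/* compiler also sees the pointed-to buffer */

XSTATE_OP() uses the second form, which is what fixes the reordering hazard described above.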
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1447932326-4371-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/include/asm/fpu/internal.h | 68 ++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 37 deletions(-)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 146d838..839e7b6 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -238,6 +238,20 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 	_ASM_EXTABLE(1b, 3b)		\
 	: [_err] "=r" (__err)
 
+#define XSTATE_OP(op, st, lmask, hmask, err)				\
+	asm volatile("1:" op "\n\t"					\
+		     "xor %[err], %[err]\n"				\
+		     "2:\n\t"						\
+		     ".pushsection .fixup,\"ax\"\n\t"			\
+		     "3: movl $-2,%[err]\n\t"				\
+		     "jmp 2b\n\t"					\
+		     ".popsection\n\t"					\
+		     _ASM_EXTABLE(1b, 3b)				\
+		     : [err] "=r" (err)					\
+		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
+		     : "memory")
+
+
 /*
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
@@ -247,22 +261,14 @@ static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
 	u64 mask = -1;
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err = 0;
+	int err;
 
 	WARN_ON(system_state != SYSTEM_BOOTING);
 
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		asm volatile("1:"XSAVES"\n\t"
-			"2:\n\t"
-			     xstate_fault(err)
-			: "D" (xstate), "m" (*xstate), "a" (lmask), "d" (hmask), "0" (err)
-			: "memory");
+	if (static_cpu_has_safe(X86_FEATURE_XSAVES))
+		XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
 	else
-		asm volatile("1:"XSAVE"\n\t"
-			"2:\n\t"
-			     xstate_fault(err)
-			: "D" (xstate), "m" (*xstate), "a" (lmask), "d" (hmask), "0" (err)
-			: "memory");
+		XSTATE_OP(XSAVE, xstate, lmask, hmask, err);
 
 	/* We should never fault when copying to a kernel buffer: */
 	WARN_ON_FPU(err);
@@ -277,22 +283,14 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
 	u64 mask = -1;
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err = 0;
+	int err;
 
 	WARN_ON(system_state != SYSTEM_BOOTING);
 
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		asm volatile("1:"XRSTORS"\n\t"
-			"2:\n\t"
-			     xstate_fault(err)
-			: "D" (xstate), "m" (*xstate), "a" (lmask), "d" (hmask), "0" (err)
-			: "memory");
+	if (static_cpu_has_safe(X86_FEATURE_XSAVES))
+		XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 	else
-		asm volatile("1:"XRSTOR"\n\t"
-			"2:\n\t"
-			     xstate_fault(err)
-			: "D" (xstate), "m" (*xstate), "a" (lmask), "d" (hmask), "0" (err)
-			: "memory");
+		XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
 
 	/* We should never fault when copying from a kernel buffer: */
 	WARN_ON_FPU(err);
@@ -389,12 +387,10 @@ static inline int copy_xregs_to_user(struct xregs_state __user *buf)
 	if (unlikely(err))
 		return -EFAULT;
 
-	__asm__ __volatile__(ASM_STAC "\n"
-			     "1:"XSAVE"\n"
-			     "2: " ASM_CLAC "\n"
-			     xstate_fault(err)
-			     : "D" (buf), "a" (-1), "d" (-1), "0" (err)
-			     : "memory");
+	stac();
+	XSTATE_OP(XSAVE, buf, -1, -1, err);
+	clac();
+
 	return err;
 }
 
@@ -406,14 +402,12 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
 	struct xregs_state *xstate = ((__force struct xregs_state *)buf);
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err = 0;
+	int err;
+
+	stac();
+	XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
+	clac();
 
-	__asm__ __volatile__(ASM_STAC "\n"
-			     "1:"XRSTOR"\n"
-			     "2: " ASM_CLAC "\n"
-			     xstate_fault(err)
-			     : "D" (xstate), "a" (lmask), "d" (hmask), "0" (err)
-			     : "memory");	/* memory required? */
+
 	return err;
 }
From: Borislav Petkov <bp@suse.de>
commit b7106fa0f29f9fd83d2d1905ab690d334ef855c1 upstream
Add macros for the alternative XSAVE*/XRSTOR* operations which contain the fault handling and use them. Kill xstate_fault().
Also, copy_xregs_to_kernel() didn't have the extended state as memory reference in the asm.
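A rough sketch of why the fixup code references label 661 (illustration only): the ALTERNATIVE* macros emit the original instruction between the local labels 661 and 662, together with an .altinstructions record that tells apply_alternatives() to patch that range at boot:

	/*
	 *   661: <original insn, e.g. XSAVE>	<- may be live-patched to XSAVEOPT/XSAVES
	 *   662:
	 */

Since whichever XSAVE* variant actually runs lives at 661, that is the address the exception table entry has to cover.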
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1447932326-4371-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/include/asm/fpu/internal.h | 105 +++++++++++++++++------------------
 1 file changed, 52 insertions(+), 53 deletions(-)
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 839e7b6..60c984a 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -225,19 +225,6 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 #define XRSTOR		".byte " REX_PREFIX "0x0f,0xae,0x2f"
 #define XRSTORS		".byte " REX_PREFIX "0x0f,0xc7,0x1f"
 
-/* xstate instruction fault handler: */
-#define xstate_fault(__err)		\
-					\
-	".section .fixup,\"ax\"\n"	\
-					\
-	"3:  movl $-2,%[_err]\n"	\
-	"    jmp  2b\n"			\
-					\
-	".previous\n"			\
-					\
-	_ASM_EXTABLE(1b, 3b)		\
-	: [_err] "=r" (__err)
-
 #define XSTATE_OP(op, st, lmask, hmask, err)				\
 	asm volatile("1:" op "\n\t"					\
 		     "xor %[err], %[err]\n"				\
@@ -251,6 +238,54 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
 
+/*
+ * If XSAVES is enabled, it replaces XSAVEOPT because it supports a compact
+ * format and supervisor states in addition to modified optimization in
+ * XSAVEOPT.
+ *
+ * Otherwise, if XSAVEOPT is enabled, XSAVEOPT replaces XSAVE because XSAVEOPT
+ * supports modified optimization which is not supported by XSAVE.
+ *
+ * We use XSAVE as a fallback.
+ *
+ * The 661 label is defined in the ALTERNATIVE* macros as the address of the
+ * original instruction which gets replaced. We need to use it here as the
+ * address of the instruction where we might get an exception at.
+ */
+#define XSTATE_XSAVE(st, lmask, hmask, err)				\
+	asm volatile(ALTERNATIVE_2(XSAVE,				\
+				   XSAVEOPT, X86_FEATURE_XSAVEOPT,	\
+				   XSAVES,   X86_FEATURE_XSAVES)	\
+		     "\n"						\
+		     "xor %[err], %[err]\n"				\
+		     "3:\n"						\
+		     ".pushsection .fixup,\"ax\"\n"			\
+		     "4: movl $-2, %[err]\n"				\
+		     "jmp 3b\n"						\
+		     ".popsection\n"					\
+		     _ASM_EXTABLE(661b, 4b)				\
+		     : [err] "=r" (err)					\
+		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
+		     : "memory")
+
+/*
+ * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact
+ * XSAVE area format.
+ */
+#define XSTATE_XRESTORE(st, lmask, hmask, err)				\
+	asm volatile(ALTERNATIVE(XRSTOR,				\
+				 XRSTORS, X86_FEATURE_XSAVES)		\
+		     "\n"						\
+		     "xor %[err], %[err]\n"				\
+		     "3:\n"						\
+		     ".pushsection .fixup,\"ax\"\n"			\
+		     "4: movl $-2, %[err]\n"				\
+		     "jmp 3b\n"						\
+		     ".popsection\n"					\
+		     _ASM_EXTABLE(661b, 4b)				\
+		     : [err] "=r" (err)					\
+		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
+		     : "memory")
+
 /*
  * This function is called only during boot time when x86 caps are not set
@@ -304,33 +339,11 @@ static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
 	u64 mask = -1;
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err = 0;
+	int err;
 
 	WARN_ON(!alternatives_patched);
 
-	/*
-	 * If xsaves is enabled, xsaves replaces xsaveopt because
-	 * it supports compact format and supervisor states in addition to
-	 * modified optimization in xsaveopt.
-	 *
-	 * Otherwise, if xsaveopt is enabled, xsaveopt replaces xsave
-	 * because xsaveopt supports modified optimization which is not
-	 * supported by xsave.
-	 *
-	 * If none of xsaves and xsaveopt is enabled, use xsave.
-	 */
-	alternative_input_2(
-		"1:"XSAVE,
-		XSAVEOPT,
-		X86_FEATURE_XSAVEOPT,
-		XSAVES,
-		X86_FEATURE_XSAVES,
-		[xstate] "D" (xstate), "a" (lmask), "d" (hmask) :
-		"memory");
-	asm volatile("2:\n\t"
-		     xstate_fault(err)
-		     : "0" (err)
-		     : "memory");
+	XSTATE_XSAVE(xstate, lmask, hmask, err);
 
 	/* We should never fault when copying to a kernel buffer: */
 	WARN_ON_FPU(err);
@@ -343,23 +356,9 @@ static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
-	int err = 0;
+	int err;
 
-	/*
-	 * Use xrstors to restore context if it is enabled. xrstors supports
-	 * compacted format of xsave area which is not supported by xrstor.
-	 */
-	alternative_input(
-		"1: " XRSTOR,
-		XRSTORS,
-		X86_FEATURE_XSAVES,
-		"D" (xstate), "m" (*xstate), "a" (lmask), "d" (hmask)
-		: "memory");
-
-	asm volatile("2:\n"
-		     xstate_fault(err)
-		     : "0" (err)
-		     : "memory");
+	XSTATE_XRESTORE(xstate, lmask, hmask, err);
 
 	/* We should never fault when copying from a kernel buffer: */
 	WARN_ON_FPU(err);
From: Andi Kleen <ak@linux.intel.com>
commit 153a4334c439cfb62e1d31cee0c790ba4157813d upstream
asm/atomic.h doesn't really need asm/processor.h anymore. Everything it uses has moved to other header files. So remove that include.
processor.h is a nasty header that includes lots of other headers and makes it prone to include loops. Removing the include here makes asm/atomic.h a "leaf" header that can be safely included in most other headers.
The only fallout is in the lib/atomic tester which relied on this implicit include. Give it an explicit include. (the include is in ifdef because the user is also in ifdef)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/1449018060-1742-1-git-send-email-andi@firstfloor.or...
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 arch/x86/include/asm/atomic.h      | 1 -
 arch/x86/include/asm/atomic64_32.h | 1 -
 lib/atomic64_test.c                | 4 ++++
 3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index ae5fb83..3e86742 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -3,7 +3,6 @@
 
 #include <linux/compiler.h>
 #include <linux/types.h>
-#include <asm/processor.h>
 #include <asm/alternative.h>
 #include <asm/cmpxchg.h>
 #include <asm/rmwcc.h>
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index a11c30b..a984111 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -3,7 +3,6 @@
 
 #include <linux/compiler.h>
 #include <linux/types.h>
-#include <asm/processor.h>
 //#include <asm/cmpxchg.h>
 
 /* An 64bit atomic type */
diff --git a/lib/atomic64_test.c b/lib/atomic64_test.c
index 83c33a5b..d51e25a 100644
--- a/lib/atomic64_test.c
+++ b/lib/atomic64_test.c
@@ -16,6 +16,10 @@
 #include <linux/kernel.h>
 #include <linux/atomic.h>
 
+#ifdef CONFIG_X86
+#include <asm/processor.h>	/* for boot_cpu_has below */
+#endif
+
 #define TEST(bit, op, c_op, val)		\
 do {						\
 	atomic##bit##_set(&v, v0);		\
From: Borislav Petkov <bp@suse.de>
commit cd4d09ec6f6c12a2cc3db5b7d8876a325a53545b upstream
Move them to a separate header and have the following dependency:
x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h
This makes it easier to use the header in asm code and not include the whole cpufeature.h and add guards for asm.
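In other words, the layering after this patch looks roughly like this:

	#include <asm/cpufeatures.h>	/* bare X86_FEATURE_* / NCAPINTS defines; safe for .S files */
	#include <asm/processor.h>	/* struct cpuinfo_x86 etc.; pulls in cpufeatures.h */
	#include <asm/cpufeature.h>	/* C-only helpers such as cpu_has(); pulls in processor.h */

Assembly and boot code can now include cpufeatures.h directly without dragging in C declarations.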
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
Reviewed-by: Matt Helsley (VMware) <matt.helsley@gmail.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
Reviewed-by: Bo Gan <ganb@vmware.com>
---
 Documentation/kernel-parameters.txt      |   2
 arch/x86/boot/cpuflags.h                 |   2
 arch/x86/boot/mkcpustr.c                 |   2
 arch/x86/crypto/crc32-pclmul_glue.c      |   2
 arch/x86/crypto/crc32c-intel_glue.c      |   2
 arch/x86/crypto/crct10dif-pclmul_glue.c  |   2
 arch/x86/entry/common.c                  |   1
 arch/x86/entry/entry_32.S                |   2
 arch/x86/entry/vdso/vdso32-setup.c       |   1
 arch/x86/entry/vdso/vdso32/system_call.S |   2
 arch/x86/entry/vdso/vma.c                |   1
 arch/x86/include/asm/alternative.h       |   6 -
 arch/x86/include/asm/apic.h              |   1
 arch/x86/include/asm/arch_hweight.h      |   2
 arch/x86/include/asm/cmpxchg.h           |   1
 arch/x86/include/asm/cpufeature.h        | 293 ------------------------------
 arch/x86/include/asm/cpufeatures.h       | 297 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/fpu/internal.h      |   1
 arch/x86/include/asm/irq_work.h          |   2
 arch/x86/include/asm/mwait.h             |   2
 arch/x86/include/asm/nospec-branch.h     |   2
 arch/x86/include/asm/processor.h         |   3
 arch/x86/include/asm/smap.h              |   2
 arch/x86/include/asm/smp.h               |   1
 arch/x86/include/asm/thread_info.h       |   2
 arch/x86/include/asm/tlbflush.h          |   1
 arch/x86/include/asm/uaccess_64.h        |   2
 arch/x86/kernel/cpu/Makefile             |   2
 arch/x86/kernel/cpu/centaur.c            |   2
 arch/x86/kernel/cpu/cyrix.c              |   1
 arch/x86/kernel/cpu/intel.c              |   2
 arch/x86/kernel/cpu/intel_cacheinfo.c    |   2
 arch/x86/kernel/cpu/match.c              |   2
 arch/x86/kernel/cpu/mkcapflags.sh        |   6 -
 arch/x86/kernel/cpu/mtrr/main.c          |   2
 arch/x86/kernel/cpu/transmeta.c          |   2
 arch/x86/kernel/e820.c                   |   1
 arch/x86/kernel/head_32.S                |   2
 arch/x86/kernel/hpet.c                   |   1
 arch/x86/kernel/msr.c                    |   2
 arch/x86/kernel/verify_cpu.S             |   2
 arch/x86/lib/clear_page_64.S             |   2
 arch/x86/lib/copy_page_64.S              |   2
 arch/x86/lib/copy_user_64.S              |   2
 arch/x86/lib/memcpy_64.S                 |   2
 arch/x86/lib/memmove_64.S                |   2
 arch/x86/lib/memset_64.S                 |   2
 arch/x86/lib/retpoline.S                 |   2
 arch/x86/mm/setup_nx.c                   |   1
 arch/x86/oprofile/op_model_amd.c         |   1
 arch/x86/um/asm/barrier.h                |   2
 lib/atomic64_test.c                      |   2
 52 files changed, 347 insertions(+), 339 deletions(-)
 create mode 100644 arch/x86/include/asm/cpufeatures.h
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4df6bd7..e60d0b5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -652,7 +652,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
 	clearcpuid=BITNUM [X86]
 			Disable CPUID feature X for the kernel. See
-			arch/x86/include/asm/cpufeature.h for the valid bit
+			arch/x86/include/asm/cpufeatures.h for the valid bit
 			numbers. Note the Linux specific bits are not
 			necessarily stable over kernel options, but the
 			vendor specific ones should be.
diff --git a/arch/x86/boot/cpuflags.h b/arch/x86/boot/cpuflags.h
index ea97697..4cb404f 100644
--- a/arch/x86/boot/cpuflags.h
+++ b/arch/x86/boot/cpuflags.h
@@ -1,7 +1,7 @@
 #ifndef BOOT_CPUFLAGS_H
 #define BOOT_CPUFLAGS_H
 
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/processor-flags.h>
 
 struct cpu_features {
diff --git a/arch/x86/boot/mkcpustr.c b/arch/x86/boot/mkcpustr.c
index 637097e..f72498d 100644
--- a/arch/x86/boot/mkcpustr.c
+++ b/arch/x86/boot/mkcpustr.c
@@ -17,7 +17,7 @@
 
 #include "../include/asm/required-features.h"
 #include "../include/asm/disabled-features.h"
-#include "../include/asm/cpufeature.h"
+#include "../include/asm/cpufeatures.h"
 #include "../kernel/cpu/capflags.c"
 
 int main(void)
diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 07d2c6c..27226df 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -33,7 +33,7 @@
 #include <linux/crc32.h>
 #include <crypto/internal/hash.h>
 
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/cpu_device_id.h>
 #include <asm/fpu/api.h>
 
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 15f5c76..715399b 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -30,7 +30,7 @@
 #include <linux/kernel.h>
 #include <crypto/internal/hash.h>
 
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/cpu_device_id.h>
 #include <asm/fpu/internal.h>
 
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index a3fcfc9..cd4df93 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -30,7 +30,7 @@
 #include <linux/string.h>
 #include <linux/kernel.h>
 #include <asm/fpu/api.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/cpu_device_id.h>
 
 asmlinkage __u16 crc_t10dif_pcl(__u16 crc, const unsigned char *buf,
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index b5eb1cc..071582a 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -27,6 +27,7 @@
 #include <asm/traps.h>
 #include <asm/vdso.h>
 #include <asm/uaccess.h>
+#include <asm/cpufeature.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d437f387..49a8c9f 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -40,7 +40,7 @@
 #include <asm/processor-flags.h>
 #include <asm/ftrace.h>
 #include <asm/irq_vectors.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/alternative-asm.h>
 #include <asm/asm.h>
 #include <asm/smap.h>
diff --git a/arch/x86/entry/vdso/vdso32-setup.c b/arch/x86/entry/vdso/vdso32-setup.c
index a7508d7..3f9d1a8 100644
--- a/arch/x86/entry/vdso/vdso32-setup.c
+++ b/arch/x86/entry/vdso/vdso32-setup.c
@@ -11,7 +11,6 @@
 #include <linux/kernel.h>
 #include <linux/mm_types.h>
 
-#include <asm/cpufeature.h>
 #include <asm/processor.h>
 #include <asm/vdso.h>
 
diff --git a/arch/x86/entry/vdso/vdso32/system_call.S b/arch/x86/entry/vdso/vdso32/system_call.S
index 3a1d929..0109ac6 100644
--- a/arch/x86/entry/vdso/vdso32/system_call.S
+++ b/arch/x86/entry/vdso/vdso32/system_call.S
@@ -3,7 +3,7 @@
  */
 
 #include <asm/dwarf2.h>
-#include <asm/cpufeature.h>
+#include <asm/cpufeatures.h>
 #include <asm/alternative-asm.h>
 
 /*
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index b8f69e2..5471ac3 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -20,6 +20,7 @@
 #include <asm/page.h>
 #include <asm/hpet.h>
 #include <asm/desc.h>
+#include <asm/cpufeature.h>
 
 #if defined(CONFIG_X86_64)
 unsigned int __read_mostly vdso64_enabled = 1;
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 215ea92..002fcd9 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -154,12 +154,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	".popsection\n"
 
 /*
- * This must be included *after* the definition of ALTERNATIVE due to
- * <asm/arch_hweight.h>
- */
-#include <asm/cpufeature.h>
-
-/*
  * Alternative instructions for different CPU types or capabilities.
  *
  * This allows to use optimized instructions even on generic binary
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 163769d..fd810a5 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -6,7 +6,6 @@
 
 #include <asm/alternative.h>
 #include <asm/cpufeature.h>
-#include <asm/processor.h>
 #include <asm/apicdef.h>
 #include <linux/atomic.h>
 #include <asm/fixmap.h>
diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h
index 44f825c..e7cd631 100644
--- a/arch/x86/include/asm/arch_hweight.h
+++ b/arch/x86/include/asm/arch_hweight.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_HWEIGHT_H
 #define _ASM_X86_HWEIGHT_H
 
+#include <asm/cpufeatures.h>
+
 #ifdef CONFIG_64BIT
 /* popcnt %edi, %eax */
 #define POPCNT32 ".byte 0xf3,0x0f,0xb8,0xc7"
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index ad19841..9733361 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -2,6 +2,7 @@
 #define ASM_X86_CMPXCHG_H
 
 #include <linux/compiler.h>
+#include <asm/cpufeatures.h>
 #include <asm/alternative.h> /* Provides LOCK_PREFIX */
 
 /*
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 67f917b..f62e872 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -1,298 +1,7 @@
-/*
- * Defines x86 CPU feature bits
- */
 #ifndef _ASM_X86_CPUFEATURE_H
 #define _ASM_X86_CPUFEATURE_H
 
-#ifndef _ASM_X86_REQUIRED_FEATURES_H
-#include <asm/required-features.h>
-#endif
-
-#ifndef _ASM_X86_DISABLED_FEATURES_H
-#include <asm/disabled-features.h>
-#endif
-
-#define NCAPINTS	16	/* N 32-bit words worth of info */
-#define NBUGINTS	1	/* N 32-bit bug flags */
-
-/*
- * Note: If the comment begins with a quoted string, that string is used
- * in /proc/cpuinfo instead of the macro name.  If the string is "",
- * this feature bit is not displayed in /proc/cpuinfo at all.
- */
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
-#define X86_FEATURE_FPU		( 0*32+ 0) /* Onboard FPU */
-#define X86_FEATURE_VME		( 0*32+ 1) /* Virtual Mode Extensions */
-#define X86_FEATURE_DE		( 0*32+ 2) /* Debugging Extensions */
-#define X86_FEATURE_PSE		( 0*32+ 3) /* Page Size Extensions */
-#define X86_FEATURE_TSC		( 0*32+ 4) /* Time Stamp Counter */
-#define X86_FEATURE_MSR		( 0*32+ 5) /* Model-Specific Registers */
-#define X86_FEATURE_PAE		( 0*32+ 6) /* Physical Address Extensions */
-#define X86_FEATURE_MCE		( 0*32+ 7) /* Machine Check Exception */
-#define X86_FEATURE_CX8		( 0*32+ 8) /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC	( 0*32+ 9) /* Onboard APIC */
-#define X86_FEATURE_SEP		( 0*32+11) /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR	( 0*32+12) /* Memory Type Range Registers */
-#define X86_FEATURE_PGE		( 0*32+13) /* Page Global Enable */
-#define X86_FEATURE_MCA		( 0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV	( 0*32+15) /* CMOV instructions */
-					   /* (plus FCMOVcc, FCOMI with FPU) */
-#define X86_FEATURE_PAT		( 0*32+16) /* Page Attribute Table */
-#define X86_FEATURE_PSE36	( 0*32+17) /* 36-bit PSEs */
-#define X86_FEATURE_PN		( 0*32+18) /* Processor serial number */
-#define X86_FEATURE_CLFLUSH	( 0*32+19) /* CLFLUSH instruction */
-#define X86_FEATURE_DS		( 0*32+21) /* "dts" Debug Store */
-#define X86_FEATURE_ACPI	( 0*32+22) /* ACPI via MSR */
-#define X86_FEATURE_MMX		( 0*32+23) /* Multimedia Extensions */
-#define X86_FEATURE_FXSR	( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */
-#define X86_FEATURE_XMM		( 0*32+25) /* "sse" */
-#define X86_FEATURE_XMM2	( 0*32+26) /* "sse2" */
-#define X86_FEATURE_SELFSNOOP	( 0*32+27) /* "ss" CPU self snoop */
-#define X86_FEATURE_HT		( 0*32+28) /* Hyper-Threading */
-#define X86_FEATURE_ACC		( 0*32+29) /* "tm" Automatic clock control */
-#define X86_FEATURE_IA64	( 0*32+30) /* IA-64 processor */
-#define X86_FEATURE_PBE		( 0*32+31) /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL	( 1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP		( 1*32+19) /* MP Capable. */
-#define X86_FEATURE_NX		( 1*32+20) /* Execute Disable */
-#define X86_FEATURE_MMXEXT	( 1*32+22) /* AMD MMX extensions */
-#define X86_FEATURE_FXSR_OPT	( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
-#define X86_FEATURE_GBPAGES	( 1*32+26) /* "pdpe1gb" GB pages */
-#define X86_FEATURE_RDTSCP	( 1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM		( 1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT	( 1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW	( 1*32+31) /* 3DNow! */
-
-/* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
-#define X86_FEATURE_RECOVERY	( 2*32+ 0) /* CPU in recovery mode */
-#define X86_FEATURE_LONGRUN	( 2*32+ 1) /* Longrun power control */
-#define X86_FEATURE_LRTI	( 2*32+ 3) /* LongRun table interface */
-
-/* Other features, Linux-defined mapping, word 3 */
-/* This range is used for feature bits which conflict or are synthesized */
-#define X86_FEATURE_CXMMX	( 3*32+ 0) /* Cyrix MMX extensions */
-#define X86_FEATURE_K6_MTRR	( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
-#define X86_FEATURE_CYRIX_ARR	( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
-#define X86_FEATURE_CENTAUR_MCR	( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
-/* cpu types for specific tunings: */
-#define X86_FEATURE_K8		( 3*32+ 4) /* "" Opteron, Athlon64 */
-#define X86_FEATURE_K7		( 3*32+ 5) /* "" Athlon */
-#define X86_FEATURE_P3		( 3*32+ 6) /* "" P3 */
-#define X86_FEATURE_P4		( 3*32+ 7) /* "" P4 */
-#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
-#define X86_FEATURE_UP		( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
-#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
-#define X86_FEATURE_PEBS	( 3*32+12) /* Precise-Event Based Sampling */
-#define X86_FEATURE_BTS		( 3*32+13) /* Branch Trace Store */
-#define X86_FEATURE_SYSCALL32	( 3*32+14) /* "" syscall in ia32 userspace */
-#define X86_FEATURE_SYSENTER32	( 3*32+15) /* "" sysenter in ia32 userspace */
-#define X86_FEATURE_REP_GOOD	( 3*32+16) /* rep microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" Mfence synchronizes RDTSC */
-#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" Lfence synchronizes RDTSC */
-/* free, was #define X86_FEATURE_11AP ( 3*32+19) * "" Bad local APIC aka 11AP */
-#define X86_FEATURE_NOPL	( 3*32+20) /* The NOPL (0F 1F) instructions */
-#define X86_FEATURE_ALWAYS	( 3*32+21) /* "" Always-present feature */
-#define X86_FEATURE_XTOPOLOGY	( 3*32+22) /* cpu topology enum extensions */
-#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
-#define X86_FEATURE_NONSTOP_TSC	( 3*32+24) /* TSC does not stop in C states */
-/* free, was #define X86_FEATURE_CLFLUSH_MONITOR ( 3*32+25) * "" clflush reqd with monitor */
-#define X86_FEATURE_EXTD_APICID	( 3*32+26) /* has extended APICID (8 bits) */
-#define X86_FEATURE_AMD_DCM	( 3*32+27) /* multi-node processor */
-#define X86_FEATURE_APERFMPERF	( 3*32+28) /* APERFMPERF */
-/* free, was #define X86_FEATURE_EAGER_FPU ( 3*32+29) * "eagerfpu" Non lazy FPU restore */
-#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
-#define X86_FEATURE_XMM3	( 4*32+ 0) /* "pni" SSE-3 */
-#define X86_FEATURE_PCLMULQDQ	( 4*32+ 1) /* PCLMULQDQ instruction */
-#define X86_FEATURE_DTES64	( 4*32+ 2) /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT	( 4*32+ 3) /* "monitor" Monitor/Mwait support */
-#define X86_FEATURE_DSCPL	( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
-#define X86_FEATURE_VMX		( 4*32+ 5) /* Hardware virtualization */
-#define X86_FEATURE_SMX		( 4*32+ 6) /* Safer mode */
-#define X86_FEATURE_EST		( 4*32+ 7) /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2		( 4*32+ 8) /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3	( 4*32+ 9) /* Supplemental SSE-3 */
-#define X86_FEATURE_CID		( 4*32+10) /* Context ID */
-#define X86_FEATURE_SDBG	( 4*32+11) /* Silicon Debug */
-#define X86_FEATURE_FMA		( 4*32+12) /* Fused multiply-add */
-#define X86_FEATURE_CX16	( 4*32+13) /* CMPXCHG16B */
-#define X86_FEATURE_XTPR	( 4*32+14) /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM	( 4*32+15) /* Performance Capabilities */
-#define X86_FEATURE_PCID	( 4*32+17) /* Process Context Identifiers */
-#define X86_FEATURE_DCA		( 4*32+18) /* Direct Cache Access */
-#define X86_FEATURE_XMM4_1	( 4*32+19) /* "sse4_1" SSE-4.1 */
-#define X86_FEATURE_XMM4_2	( 4*32+20) /* "sse4_2" SSE-4.2 */
-#define X86_FEATURE_X2APIC	( 4*32+21) /* x2APIC */
-#define X86_FEATURE_MOVBE	( 4*32+22) /* MOVBE instruction */
-#define X86_FEATURE_POPCNT	( 4*32+23) /* POPCNT instruction */
-#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* Tsc deadline timer */
-#define X86_FEATURE_AES		( 4*32+25) /* AES instructions */
-#define X86_FEATURE_XSAVE	( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_OSXSAVE	( 4*32+27) /* "" XSAVE enabled in the OS */
-#define X86_FEATURE_AVX		( 4*32+28) /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C	( 4*32+29) /* 16-bit fp conversions */
-#define X86_FEATURE_RDRAND	( 4*32+30) /* The RDRAND instruction */
-#define X86_FEATURE_HYPERVISOR	( 4*32+31) /* Running on a hypervisor */
-
-/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
-#define X86_FEATURE_XSTORE	( 5*32+ 2) /* "rng" RNG present (xstore) */
-#define X86_FEATURE_XSTORE_EN	( 5*32+ 3) /* "rng_en" RNG enabled */
-#define X86_FEATURE_XCRYPT	( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
-#define X86_FEATURE_XCRYPT_EN	( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */
-#define X86_FEATURE_ACE2	( 5*32+ 8) /* Advanced Cryptography Engine v2 */
-#define X86_FEATURE_ACE2_EN	( 5*32+ 9) /* ACE v2 enabled */
-#define X86_FEATURE_PHE		( 5*32+10) /* PadLock Hash Engine */
-#define X86_FEATURE_PHE_EN	( 5*32+11) /* PHE enabled */
-#define X86_FEATURE_PMM		( 5*32+12) /* PadLock Montgomery Multiplier */
-#define X86_FEATURE_PMM_EN	( 5*32+13) /* PMM enabled */
-
-/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
-#define X86_FEATURE_LAHF_LM	( 6*32+ 0) /* LAHF/SAHF in long mode */
-#define X86_FEATURE_CMP_LEGACY	( 6*32+ 1) /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM		( 6*32+ 2) /* Secure virtual machine */
-#define X86_FEATURE_EXTAPIC	( 6*32+ 3) /* Extended APIC space */
-#define X86_FEATURE_CR8_LEGACY	( 6*32+ 4) /* CR8 in 32-bit mode */
-#define X86_FEATURE_ABM		( 6*32+ 5) /* Advanced bit manipulation */
-#define X86_FEATURE_SSE4A	( 6*32+ 6) /* SSE-4A */
-#define X86_FEATURE_MISALIGNSSE	( 6*32+ 7) /* Misaligned SSE mode */
-#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */
-#define X86_FEATURE_OSVW	( 6*32+ 9) /* OS Visible Workaround */
-#define X86_FEATURE_IBS		( 6*32+10) /* Instruction Based Sampling */
-#define X86_FEATURE_XOP		( 6*32+11) /* extended AVX instructions */
-#define X86_FEATURE_SKINIT	( 6*32+12) /* SKINIT/STGI instructions */
-#define X86_FEATURE_WDT		( 6*32+13) /* Watchdog timer */
-#define X86_FEATURE_LWP		( 6*32+15) /* Light Weight Profiling */
-#define X86_FEATURE_FMA4	( 6*32+16) /* 4 operands MAC instructions */
-#define X86_FEATURE_TCE		( 6*32+17) /* translation cache extension */
-#define X86_FEATURE_NODEID_MSR	( 6*32+19) /* NodeId MSR */
-#define X86_FEATURE_TBM		( 6*32+21) /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT	( 6*32+22) /* topology extensions CPUID leafs */
-#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */
-#define X86_FEATURE_PERFCTR_NB	( 6*32+24) /* NB performance counter extensions */
-#define X86_FEATURE_BPEXT	( 6*32+26) /* data breakpoint extension */
-#define X86_FEATURE_PERFCTR_L2	( 6*32+28) /* L2 performance counter extensions */
-#define X86_FEATURE_MWAITX	( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
-
-/*
- * Auxiliary flags: Linux defined - For features scattered in various
- * CPUID levels like 0x6, 0xA etc, word 7.
- *
- * Reuse free bits when adding new feature flags!
- */
-
-#define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
-#define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
-#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 4) /* Effectively INVPCID && CR4.PCIDE=1 */
-
-#define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */
-#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
-
-#define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */
-#define X86_FEATURE_RSB_CTXSW	( 7*32+19) /* Fill RSB on context switches */
-
-#define X86_FEATURE_RETPOLINE	( 7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */
-#define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */
-/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
-#define X86_FEATURE_KAISER	( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
-
-/* Virtualization flags: Linux defined, word 8 */
-#define X86_FEATURE_TPR_SHADOW	( 8*32+ 0) /* Intel TPR Shadow */
-#define X86_FEATURE_VNMI	( 8*32+ 1) /* Intel Virtual NMI */
-#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
-#define X86_FEATURE_EPT		( 8*32+ 3) /* Intel Extended Page Table */
-#define X86_FEATURE_VPID	( 8*32+ 4) /* Intel Virtual Processor ID */
-
-#define X86_FEATURE_VMMCALL	( 8*32+15) /* Prefer vmmcall to vmcall */
-#define X86_FEATURE_XENPV	( 8*32+16) /* "" Xen paravirtual guest */
-
-
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
-#define X86_FEATURE_FSGSBASE	( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
-#define X86_FEATURE_TSC_ADJUST	( 9*32+ 1) /* TSC adjustment MSR 0x3b */
-#define X86_FEATURE_BMI1	( 9*32+ 3) /* 1st group bit manipulation extensions */
-#define X86_FEATURE_HLE		( 9*32+ 4) /* Hardware Lock Elision */
-#define X86_FEATURE_AVX2	( 9*32+ 5) /* AVX2 instructions */
-#define X86_FEATURE_SMEP	( 9*32+ 7) /* Supervisor Mode Execution Protection */
-#define X86_FEATURE_BMI2	( 9*32+ 8) /* 2nd group bit manipulation extensions */
-#define X86_FEATURE_ERMS	( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
-#define X86_FEATURE_INVPCID	( 9*32+10) /* Invalidate Processor Context ID */
-#define X86_FEATURE_RTM		( 9*32+11) /* Restricted Transactional Memory */
-#define X86_FEATURE_CQM		( 9*32+12) /* Cache QoS Monitoring */
-#define X86_FEATURE_MPX		( 9*32+14) /* Memory Protection Extension */
-#define X86_FEATURE_AVX512F	( 9*32+16) /* AVX-512 Foundation */
-#define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
-#define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
-#define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
-#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
-#define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
-#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
-#define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
-#define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
-#define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
-#define X86_FEATURE_SHA_NI	( 9*32+29) /* SHA1/SHA256 Instruction Extensions */
-
-/* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */
-#define X86_FEATURE_XSAVEOPT	(10*32+ 0) /* XSAVEOPT */
-#define X86_FEATURE_XSAVEC	(10*32+ 1) /* XSAVEC */
-#define X86_FEATURE_XGETBV1	(10*32+ 2) /* XGETBV with ECX = 1 */
-#define X86_FEATURE_XSAVES	(10*32+ 3) /* XSAVES/XRSTORS */
-
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */
-#define X86_FEATURE_CQM_LLC	(11*32+ 1) /* LLC QoS if 1 */
-
-/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */
-#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */
-
-/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
-#define X86_FEATURE_CLZERO	(13*32+ 0) /* CLZERO instruction */
-
-/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
-#define X86_FEATURE_DTHERM	(14*32+ 0) /* Digital Thermal Sensor */
-#define X86_FEATURE_IDA		(14*32+ 1) /* Intel Dynamic Acceleration */
-#define X86_FEATURE_ARAT	(14*32+ 2) /* Always Running APIC Timer */
-#define X86_FEATURE_PLN		(14*32+ 4) /* Intel Power Limit Notification */
-#define X86_FEATURE_PTS		(14*32+ 6) /* Intel Package Thermal Status */
-#define X86_FEATURE_HWP		(14*32+ 7) /* Intel Hardware P-states */
-#define X86_FEATURE_HWP_NOTIFY	(14*32+ 8) /* HWP Notification */
-#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
-#define X86_FEATURE_HWP_EPP	(14*32+10) /* HWP Energy Perf. Preference */
-#define X86_FEATURE_HWP_PKG_REQ	(14*32+11) /* HWP Package Level Request */
-
-/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
-#define X86_FEATURE_NPT		(15*32+ 0) /* Nested Page Table support */
-#define X86_FEATURE_LBRV	(15*32+ 1) /* LBR Virtualization support */
-#define X86_FEATURE_SVML	(15*32+ 2) /* "svm_lock" SVM locking MSR */
-#define X86_FEATURE_NRIPS	(15*32+ 3) /* "nrip_save" SVM next_rip save */
-#define X86_FEATURE_TSCRATEMSR	(15*32+ 4) /* "tsc_scale" TSC scaling support */
-#define X86_FEATURE_VMCBCLEAN	(15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
-#define X86_FEATURE_FLUSHBYASID	(15*32+ 6) /* flush-by-ASID support */
-#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
-#define X86_FEATURE_PAUSEFILTER	(15*32+10) /* filtered pause intercept */
-#define X86_FEATURE_PFTHRESHOLD	(15*32+12) /* pause filter threshold */
-
-/*
- * BUG word(s)
- */
-#define X86_BUG(x)		(NCAPINTS*32 + (x))
-
-#define X86_BUG_F00F		X86_BUG(0) /* Intel F00F */
-#define X86_BUG_FDIV		X86_BUG(1) /* FPU FDIV */
-#define X86_BUG_COMA		X86_BUG(2) /* Cyrix 6x86 coma */
-#define X86_BUG_AMD_TLB_MMATCH	X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */
-#define X86_BUG_AMD_APIC_C1E	X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */
-#define X86_BUG_11AP		X86_BUG(5) /* Bad local APIC aka 11AP */
-#define X86_BUG_FXSAVE_LEAK	X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */
-#define X86_BUG_CLFLUSH_MONITOR	X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */
-#define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
-#define X86_BUG_CPU_MELTDOWN	X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
-#define X86_BUG_SPECTRE_V1	X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
-#define X86_BUG_SPECTRE_V2	X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
+#include <asm/processor.h>
#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
new file mode 100644
index 0000000..255ea74
--- /dev/null
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -0,0 +1,297 @@
+#ifndef _ASM_X86_CPUFEATURES_H
+#define _ASM_X86_CPUFEATURES_H
+
+#ifndef _ASM_X86_REQUIRED_FEATURES_H
+#include <asm/required-features.h>
+#endif
+
+#ifndef _ASM_X86_DISABLED_FEATURES_H
+#include <asm/disabled-features.h>
+#endif
+
+/*
+ * Defines x86 CPU feature bits
+ */
+#define NCAPINTS 16 /* N 32-bit words worth of info */
+#define NBUGINTS 1 /* N 32-bit bug flags */
+
+/*
+ * Note: If the comment begins with a quoted string, that string is used
+ * in /proc/cpuinfo instead of the macro name. If the string is "",
+ * this feature bit is not displayed in /proc/cpuinfo at all.
+ */
+
+/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
+#define X86_FEATURE_FPU ( 0*32+ 0) /* Onboard FPU */
+#define X86_FEATURE_VME ( 0*32+ 1) /* Virtual Mode Extensions */
+#define X86_FEATURE_DE ( 0*32+ 2) /* Debugging Extensions */
+#define X86_FEATURE_PSE ( 0*32+ 3) /* Page Size Extensions */
+#define X86_FEATURE_TSC ( 0*32+ 4) /* Time Stamp Counter */
+#define X86_FEATURE_MSR ( 0*32+ 5) /* Model-Specific Registers */
+#define X86_FEATURE_PAE ( 0*32+ 6) /* Physical Address Extensions */
+#define X86_FEATURE_MCE ( 0*32+ 7) /* Machine Check Exception */
+#define X86_FEATURE_CX8 ( 0*32+ 8) /* CMPXCHG8 instruction */
+#define X86_FEATURE_APIC ( 0*32+ 9) /* Onboard APIC */
+#define X86_FEATURE_SEP ( 0*32+11) /* SYSENTER/SYSEXIT */
+#define X86_FEATURE_MTRR ( 0*32+12) /* Memory Type Range Registers */
+#define X86_FEATURE_PGE ( 0*32+13) /* Page Global Enable */
+#define X86_FEATURE_MCA ( 0*32+14) /* Machine Check Architecture */
+#define X86_FEATURE_CMOV ( 0*32+15) /* CMOV instructions */
+ /* (plus FCMOVcc, FCOMI with FPU) */
+#define X86_FEATURE_PAT ( 0*32+16) /* Page Attribute Table */
+#define X86_FEATURE_PSE36 ( 0*32+17) /* 36-bit PSEs */
+#define X86_FEATURE_PN ( 0*32+18) /* Processor serial number */
+#define X86_FEATURE_CLFLUSH ( 0*32+19) /* CLFLUSH instruction */
+#define X86_FEATURE_DS ( 0*32+21) /* "dts" Debug Store */
+#define X86_FEATURE_ACPI ( 0*32+22) /* ACPI via MSR */
+#define X86_FEATURE_MMX ( 0*32+23) /* Multimedia Extensions */
+#define X86_FEATURE_FXSR ( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */
+#define X86_FEATURE_XMM ( 0*32+25) /* "sse" */
+#define X86_FEATURE_XMM2 ( 0*32+26) /* "sse2" */
+#define X86_FEATURE_SELFSNOOP ( 0*32+27) /* "ss" CPU self snoop */
+#define X86_FEATURE_HT ( 0*32+28) /* Hyper-Threading */
+#define X86_FEATURE_ACC ( 0*32+29) /* "tm" Automatic clock control */
+#define X86_FEATURE_IA64 ( 0*32+30) /* IA-64 processor */
+#define X86_FEATURE_PBE ( 0*32+31) /* Pending Break Enable */
+
+/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
+/* Don't duplicate feature flags which are redundant with Intel! */
+#define X86_FEATURE_SYSCALL ( 1*32+11) /* SYSCALL/SYSRET */
+#define X86_FEATURE_MP ( 1*32+19) /* MP Capable. */
+#define X86_FEATURE_NX ( 1*32+20) /* Execute Disable */
+#define X86_FEATURE_MMXEXT ( 1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_FXSR_OPT ( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
+#define X86_FEATURE_GBPAGES ( 1*32+26) /* "pdpe1gb" GB pages */
+#define X86_FEATURE_RDTSCP ( 1*32+27) /* RDTSCP */
+#define X86_FEATURE_LM ( 1*32+29) /* Long Mode (x86-64) */
+#define X86_FEATURE_3DNOWEXT ( 1*32+30) /* AMD 3DNow! extensions */
+#define X86_FEATURE_3DNOW ( 1*32+31) /* 3DNow! */
+
+/* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
+#define X86_FEATURE_RECOVERY ( 2*32+ 0) /* CPU in recovery mode */
+#define X86_FEATURE_LONGRUN ( 2*32+ 1) /* Longrun power control */
+#define X86_FEATURE_LRTI ( 2*32+ 3) /* LongRun table interface */
+
+/* Other features, Linux-defined mapping, word 3 */
+/* This range is used for feature bits which conflict or are synthesized */
+#define X86_FEATURE_CXMMX ( 3*32+ 0) /* Cyrix MMX extensions */
+#define X86_FEATURE_K6_MTRR ( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
+#define X86_FEATURE_CYRIX_ARR ( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
+#define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
+/* cpu types for specific tunings: */
+#define X86_FEATURE_K8 ( 3*32+ 4) /* "" Opteron, Athlon64 */
+#define X86_FEATURE_K7 ( 3*32+ 5) /* "" Athlon */
+#define X86_FEATURE_P3 ( 3*32+ 6) /* "" P3 */
+#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
+#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
+#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
+/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
+#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
+#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
+#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in ia32 userspace */
+#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in ia32 userspace */
+#define X86_FEATURE_REP_GOOD ( 3*32+16) /* rep microcode works well */
+#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" Mfence synchronizes RDTSC */
+#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" Lfence synchronizes RDTSC */
+/* free, was #define X86_FEATURE_11AP ( 3*32+19) * "" Bad local APIC aka 11AP */
+#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
+#define X86_FEATURE_ALWAYS ( 3*32+21) /* "" Always-present feature */
+#define X86_FEATURE_XTOPOLOGY ( 3*32+22) /* cpu topology enum extensions */
+#define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
+#define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */
+/* free, was #define X86_FEATURE_CLFLUSH_MONITOR ( 3*32+25) * "" clflush reqd with monitor */
+#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */
+#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
+#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
+/* free, was #define X86_FEATURE_EAGER_FPU ( 3*32+29) * "eagerfpu" Non lazy FPU restore */
+#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
+
+/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
+#define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */
+#define X86_FEATURE_PCLMULQDQ ( 4*32+ 1) /* PCLMULQDQ instruction */
+#define X86_FEATURE_DTES64 ( 4*32+ 2) /* 64-bit Debug Store */
+#define X86_FEATURE_MWAIT ( 4*32+ 3) /* "monitor" Monitor/Mwait support */
+#define X86_FEATURE_DSCPL ( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
+#define X86_FEATURE_VMX ( 4*32+ 5) /* Hardware virtualization */
+#define X86_FEATURE_SMX ( 4*32+ 6) /* Safer mode */
+#define X86_FEATURE_EST ( 4*32+ 7) /* Enhanced SpeedStep */
+#define X86_FEATURE_TM2 ( 4*32+ 8) /* Thermal Monitor 2 */
+#define X86_FEATURE_SSSE3 ( 4*32+ 9) /* Supplemental SSE-3 */
+#define X86_FEATURE_CID ( 4*32+10) /* Context ID */
+#define X86_FEATURE_SDBG ( 4*32+11) /* Silicon Debug */
+#define X86_FEATURE_FMA ( 4*32+12) /* Fused multiply-add */
+#define X86_FEATURE_CX16 ( 4*32+13) /* CMPXCHG16B */
+#define X86_FEATURE_XTPR ( 4*32+14) /* Send Task Priority Messages */
+#define X86_FEATURE_PDCM ( 4*32+15) /* Performance Capabilities */
+#define X86_FEATURE_PCID ( 4*32+17) /* Process Context Identifiers */
+#define X86_FEATURE_DCA ( 4*32+18) /* Direct Cache Access */
+#define X86_FEATURE_XMM4_1 ( 4*32+19) /* "sse4_1" SSE-4.1 */
+#define X86_FEATURE_XMM4_2 ( 4*32+20) /* "sse4_2" SSE-4.2 */
+#define X86_FEATURE_X2APIC ( 4*32+21) /* x2APIC */
+#define X86_FEATURE_MOVBE ( 4*32+22) /* MOVBE instruction */
+#define X86_FEATURE_POPCNT ( 4*32+23) /* POPCNT instruction */
+#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* Tsc deadline timer */
+#define X86_FEATURE_AES ( 4*32+25) /* AES instructions */
+#define X86_FEATURE_XSAVE ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
+#define X86_FEATURE_OSXSAVE ( 4*32+27) /* "" XSAVE enabled in the OS */
+#define X86_FEATURE_AVX ( 4*32+28) /* Advanced Vector Extensions */
+#define X86_FEATURE_F16C ( 4*32+29) /* 16-bit fp conversions */
+#define X86_FEATURE_RDRAND ( 4*32+30) /* The RDRAND instruction */
+#define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */
+
+/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
+#define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */
+#define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */
+#define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
+#define X86_FEATURE_XCRYPT_EN ( 5*32+ 7) /* "ace_en" on-CPU crypto enabled */
+#define X86_FEATURE_ACE2 ( 5*32+ 8) /* Advanced Cryptography Engine v2 */
+#define X86_FEATURE_ACE2_EN ( 5*32+ 9) /* ACE v2 enabled */
+#define X86_FEATURE_PHE ( 5*32+10) /* PadLock Hash Engine */
+#define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */
+#define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */
+#define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */
+
+/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
+#define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */
+#define X86_FEATURE_CMP_LEGACY ( 6*32+ 1) /* If yes HyperThreading not valid */
+#define X86_FEATURE_SVM ( 6*32+ 2) /* Secure virtual machine */
+#define X86_FEATURE_EXTAPIC ( 6*32+ 3) /* Extended APIC space */
+#define X86_FEATURE_CR8_LEGACY ( 6*32+ 4) /* CR8 in 32-bit mode */
+#define X86_FEATURE_ABM ( 6*32+ 5) /* Advanced bit manipulation */
+#define X86_FEATURE_SSE4A ( 6*32+ 6) /* SSE-4A */
+#define X86_FEATURE_MISALIGNSSE ( 6*32+ 7) /* Misaligned SSE mode */
+#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */
+#define X86_FEATURE_OSVW ( 6*32+ 9) /* OS Visible Workaround */
+#define X86_FEATURE_IBS ( 6*32+10) /* Instruction Based Sampling */
+#define X86_FEATURE_XOP ( 6*32+11) /* extended AVX instructions */
+#define X86_FEATURE_SKINIT ( 6*32+12) /* SKINIT/STGI instructions */
+#define X86_FEATURE_WDT ( 6*32+13) /* Watchdog timer */
+#define X86_FEATURE_LWP ( 6*32+15) /* Light Weight Profiling */
+#define X86_FEATURE_FMA4 ( 6*32+16) /* 4 operands MAC instructions */
+#define X86_FEATURE_TCE ( 6*32+17) /* translation cache extension */
+#define X86_FEATURE_NODEID_MSR ( 6*32+19) /* NodeId MSR */
+#define X86_FEATURE_TBM ( 6*32+21) /* trailing bit manipulations */
+#define X86_FEATURE_TOPOEXT ( 6*32+22) /* topology extensions CPUID leafs */
+#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */
+#define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */
+#define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */
+#define X86_FEATURE_PERFCTR_L2 ( 6*32+28) /* L2 performance counter extensions */
+#define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
+
+/*
+ * Auxiliary flags: Linux defined - For features scattered in various
+ * CPUID levels like 0x6, 0xA etc, word 7.
+ *
+ * Reuse free bits when adding new feature flags!
+ */
+
+#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
+#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
+#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 4) /* Effectively INVPCID && CR4.PCIDE=1 */
+
+#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
+#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
+
+#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */
+#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */
+
+#define X86_FEATURE_RETPOLINE ( 7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */
+/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
+#define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
+
+/* Virtualization flags: Linux defined, word 8 */
+#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
+#define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */
+#define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
+#define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */
+#define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */
+
+#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_XENPV ( 8*32+16) /* "" Xen paravirtual guest */
+
+
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
+#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
+#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3b */
+#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */
+#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */
+#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */
+#define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */
+#define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */
+#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */
+#define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */
+#define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */
+#define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */
+#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */
+#define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */
+#define X86_FEATURE_ADX ( 9*32+19) /* The ADCX and ADOX instructions */
+#define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT ( 9*32+22) /* PCOMMIT instruction */
+#define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB ( 9*32+24) /* CLWB instruction */
+#define X86_FEATURE_AVX512PF ( 9*32+26) /* AVX-512 Prefetch */
+#define X86_FEATURE_AVX512ER ( 9*32+27) /* AVX-512 Exponential and Reciprocal */
+#define X86_FEATURE_AVX512CD ( 9*32+28) /* AVX-512 Conflict Detection */
+#define X86_FEATURE_SHA_NI ( 9*32+29) /* SHA1/SHA256 Instruction Extensions */
+
+/* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */
+#define X86_FEATURE_XSAVEOPT (10*32+ 0) /* XSAVEOPT */
+#define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC */
+#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 */
+#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS */
+
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */
+#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */
+
+/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */
+#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */
+
+/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
+#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
+
+/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
+#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
+#define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */
+#define X86_FEATURE_PLN (14*32+ 4) /* Intel Power Limit Notification */
+#define X86_FEATURE_PTS (14*32+ 6) /* Intel Package Thermal Status */
+#define X86_FEATURE_HWP (14*32+ 7) /* Intel Hardware P-states */
+#define X86_FEATURE_HWP_NOTIFY (14*32+ 8) /* HWP Notification */
+#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
+#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */
+#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
+
+/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
+#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */
+#define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */
+#define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */
+#define X86_FEATURE_NRIPS (15*32+ 3) /* "nrip_save" SVM next_rip save */
+#define X86_FEATURE_TSCRATEMSR (15*32+ 4) /* "tsc_scale" TSC scaling support */
+#define X86_FEATURE_VMCBCLEAN (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
+#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */
+#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
+#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
+#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
+
+/*
+ * BUG word(s)
+ */
+#define X86_BUG(x) (NCAPINTS*32 + (x))
+
+#define X86_BUG_F00F X86_BUG(0) /* Intel F00F */
+#define X86_BUG_FDIV X86_BUG(1) /* FPU FDIV */
+#define X86_BUG_COMA X86_BUG(2) /* Cyrix 6x86 coma */
+#define X86_BUG_AMD_TLB_MMATCH X86_BUG(3) /* "tlb_mmatch" AMD Erratum 383 */
+#define X86_BUG_AMD_APIC_C1E X86_BUG(4) /* "apic_c1e" AMD Erratum 400 */
+#define X86_BUG_11AP X86_BUG(5) /* Bad local APIC aka 11AP */
+#define X86_BUG_FXSAVE_LEAK X86_BUG(6) /* FXSAVE leaks FOP/FIP/FOP */
+#define X86_BUG_CLFLUSH_MONITOR X86_BUG(7) /* AAI65, CLFLUSH required before MONITOR */
+#define X86_BUG_SYSRET_SS_ATTRS X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
+#define X86_BUG_CPU_MELTDOWN X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
+#define X86_BUG_SPECTRE_V1 X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
+#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
+
+#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 60c984a..94e7570 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -17,6 +17,7 @@
 #include <asm/user.h>
 #include <asm/fpu/api.h>
 #include <asm/fpu/xstate.h>
+#include <asm/cpufeature.h>
/* * High level FPU state handling functions: diff --git a/arch/x86/include/asm/irq_work.h b/arch/x86/include/asm/irq_work.h index 78162f8..d0afb05 100644 --- a/arch/x86/include/asm/irq_work.h +++ b/arch/x86/include/asm/irq_work.h @@ -1,7 +1,7 @@ #ifndef _ASM_IRQ_WORK_H #define _ASM_IRQ_WORK_H
-#include <asm/processor.h> +#include <asm/cpufeature.h>
static inline bool arch_irq_work_has_interrupt(void) { diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h index c70689b..0deeb2d 100644 --- a/arch/x86/include/asm/mwait.h +++ b/arch/x86/include/asm/mwait.h @@ -3,6 +3,8 @@
#include <linux/sched.h>
+#include <asm/cpufeature.h> + #define MWAIT_SUBSTATE_MASK 0xf #define MWAIT_CSTATE_MASK 0xf #define MWAIT_SUBSTATE_SIZE 4 diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 249f1c7..8b91041 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -5,7 +5,7 @@
#include <asm/alternative.h> #include <asm/alternative-asm.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h>
/* * Fill the CPU return stack buffer. diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9e77cea..8e415cf 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -13,7 +13,7 @@ struct vm86; #include <asm/types.h> #include <uapi/asm/sigcontext.h> #include <asm/current.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/page.h> #include <asm/pgtable_types.h> #include <asm/percpu.h> @@ -24,7 +24,6 @@ struct vm86; #include <asm/fpu/types.h>
#include <linux/personality.h> -#include <linux/cpumask.h> #include <linux/cache.h> #include <linux/threads.h> #include <linux/math64.h> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h index ba665eb..db33330 100644 --- a/arch/x86/include/asm/smap.h +++ b/arch/x86/include/asm/smap.h @@ -15,7 +15,7 @@
#include <linux/stringify.h> #include <asm/nops.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h>
/* "Raw" instruction opcodes */ #define __ASM_CLAC .byte 0x0f,0x01,0xca diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index a438c55..04d6eef 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -16,7 +16,6 @@ #endif #include <asm/thread_info.h> #include <asm/cpumask.h> -#include <asm/cpufeature.h>
extern int smp_num_siblings; extern unsigned int num_processors; diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 9b02820..18c9aaa 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -49,7 +49,7 @@ */ #ifndef __ASSEMBLY__ struct task_struct; -#include <asm/processor.h> +#include <asm/cpufeature.h> #include <linux/atomic.h>
struct thread_info { diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index a691b66..e2a89d2 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -5,6 +5,7 @@ #include <linux/sched.h>
#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/special_insns.h> #include <asm/smp.h>
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index f2f9b39..d83a55b 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -8,7 +8,7 @@ #include <linux/errno.h> #include <linux/lockdep.h> #include <asm/alternative.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/page.h>
/* diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index 8f18461..924b657 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -62,7 +62,7 @@ ifdef CONFIG_X86_FEATURE_NAMES quiet_cmd_mkcapflags = MKCAP $@ cmd_mkcapflags = $(CONFIG_SHELL) $(srctree)/$(src)/mkcapflags.sh $< $@
-cpufeature = $(src)/../../include/asm/cpufeature.h +cpufeature = $(src)/../../include/asm/cpufeatures.h
targets += capflags.c $(obj)/capflags.c: $(cpufeature) $(src)/mkcapflags.sh FORCE diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c index ae20be6..6608c03 100644 --- a/arch/x86/kernel/cpu/centaur.c +++ b/arch/x86/kernel/cpu/centaur.c @@ -1,7 +1,7 @@ #include <linux/bitops.h> #include <linux/kernel.h>
-#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/e820.h> #include <asm/mtrr.h> #include <asm/msr.h> diff --git a/arch/x86/kernel/cpu/cyrix.c b/arch/x86/kernel/cpu/cyrix.c index aaf152e..15e47c1 100644 --- a/arch/x86/kernel/cpu/cyrix.c +++ b/arch/x86/kernel/cpu/cyrix.c @@ -8,6 +8,7 @@ #include <linux/timer.h> #include <asm/pci-direct.h> #include <asm/tsc.h> +#include <asm/cpufeature.h>
#include "cpu.h"
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 565648b..9299e3b 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -8,7 +8,7 @@ #include <linux/module.h> #include <linux/uaccess.h>
-#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/pgtable.h> #include <asm/msr.h> #include <asm/bugs.h> diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c index 3fa7231..3557b3c 100644 --- a/arch/x86/kernel/cpu/intel_cacheinfo.c +++ b/arch/x86/kernel/cpu/intel_cacheinfo.c @@ -14,7 +14,7 @@ #include <linux/sysfs.h> #include <linux/pci.h>
-#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/amd_nb.h> #include <asm/smp.h>
diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c index afa9f0d..fbb5e90 100644 --- a/arch/x86/kernel/cpu/match.c +++ b/arch/x86/kernel/cpu/match.c @@ -1,5 +1,5 @@ #include <asm/cpu_device_id.h> -#include <asm/processor.h> +#include <asm/cpufeature.h> #include <linux/cpu.h> #include <linux/module.h> #include <linux/slab.h> diff --git a/arch/x86/kernel/cpu/mkcapflags.sh b/arch/x86/kernel/cpu/mkcapflags.sh index 3f20710..6988c74 100644 --- a/arch/x86/kernel/cpu/mkcapflags.sh +++ b/arch/x86/kernel/cpu/mkcapflags.sh @@ -1,6 +1,6 @@ #!/bin/sh # -# Generate the x86_cap/bug_flags[] arrays from include/asm/cpufeature.h +# Generate the x86_cap/bug_flags[] arrays from include/asm/cpufeatures.h #
IN=$1 @@ -49,8 +49,8 @@ dump_array() trap 'rm "$OUT"' EXIT
( - echo "#ifndef _ASM_X86_CPUFEATURE_H" - echo "#include <asm/cpufeature.h>" + echo "#ifndef _ASM_X86_CPUFEATURES_H" + echo "#include <asm/cpufeatures.h>" echo "#endif" echo ""
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c index f924f41..49bd700 100644 --- a/arch/x86/kernel/cpu/mtrr/main.c +++ b/arch/x86/kernel/cpu/mtrr/main.c @@ -47,7 +47,7 @@ #include <linux/smp.h> #include <linux/syscore_ops.h>
-#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/e820.h> #include <asm/mtrr.h> #include <asm/msr.h> diff --git a/arch/x86/kernel/cpu/transmeta.c b/arch/x86/kernel/cpu/transmeta.c index 252da7a..a19a663 100644 --- a/arch/x86/kernel/cpu/transmeta.c +++ b/arch/x86/kernel/cpu/transmeta.c @@ -1,6 +1,6 @@ #include <linux/kernel.h> #include <linux/mm.h> -#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/msr.h> #include "cpu.h"
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 52a2526..19bc19d 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -24,6 +24,7 @@ #include <asm/e820.h> #include <asm/proto.h> #include <asm/setup.h> +#include <asm/cpufeature.h>
/* * The e820 map is the map that gets modified e.g. with command line parameters diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S index 70284d3..1c0b49f 100644 --- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -19,7 +19,7 @@ #include <asm/setup.h> #include <asm/processor-flags.h> #include <asm/msr-index.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/percpu.h> #include <asm/nops.h> #include <asm/bootparam.h> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index f48eb8e..3fdc1e5 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -12,6 +12,7 @@ #include <linux/pm.h> #include <linux/io.h>
+#include <asm/cpufeature.h> #include <asm/irqdomain.h> #include <asm/fixmap.h> #include <asm/hpet.h> diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c index 113e707..f95ac5d 100644 --- a/arch/x86/kernel/msr.c +++ b/arch/x86/kernel/msr.c @@ -40,7 +40,7 @@ #include <linux/uaccess.h> #include <linux/gfp.h>
-#include <asm/processor.h> +#include <asm/cpufeature.h> #include <asm/msr.h>
static struct class *msr_class; diff --git a/arch/x86/kernel/verify_cpu.S b/arch/x86/kernel/verify_cpu.S index 4cf401f..b7c9db5 100644 --- a/arch/x86/kernel/verify_cpu.S +++ b/arch/x86/kernel/verify_cpu.S @@ -30,7 +30,7 @@ * appropriately. Either display a message or halt. */
-#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/msr-index.h>
verify_cpu: diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S index a2fe51b..65be7cf 100644 --- a/arch/x86/lib/clear_page_64.S +++ b/arch/x86/lib/clear_page_64.S @@ -1,5 +1,5 @@ #include <linux/linkage.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h>
/* diff --git a/arch/x86/lib/copy_page_64.S b/arch/x86/lib/copy_page_64.S index 009f982..24ef1c2 100644 --- a/arch/x86/lib/copy_page_64.S +++ b/arch/x86/lib/copy_page_64.S @@ -1,7 +1,7 @@ /* Written 2003 by Andi Kleen, based on a kernel by Evandro Menezes */
#include <linux/linkage.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h>
/* diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index 423644c..accf7f2 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -10,7 +10,7 @@ #include <asm/current.h> #include <asm/asm-offsets.h> #include <asm/thread_info.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h> #include <asm/asm.h> #include <asm/smap.h> diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S index 16698bb..a0de849 100644 --- a/arch/x86/lib/memcpy_64.S +++ b/arch/x86/lib/memcpy_64.S @@ -1,7 +1,7 @@ /* Copyright 2002 Andi Kleen */
#include <linux/linkage.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h>
/* diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S index ca2afdd..90ce01b 100644 --- a/arch/x86/lib/memmove_64.S +++ b/arch/x86/lib/memmove_64.S @@ -6,7 +6,7 @@ * - Copyright 2011 Fenghua Yu fenghua.yu@intel.com */ #include <linux/linkage.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h>
#undef memmove diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S index 2661fad..c9c8122 100644 --- a/arch/x86/lib/memset_64.S +++ b/arch/x86/lib/memset_64.S @@ -1,7 +1,7 @@ /* Copyright 2002 Andi Kleen, SuSE Labs */
#include <linux/linkage.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h>
.weak memset diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S index 3d06b48..7bbb853 100644 --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -3,7 +3,7 @@ #include <linux/stringify.h> #include <linux/linkage.h> #include <asm/dwarf2.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/alternative-asm.h> #include <asm-generic/export.h> #include <asm/nospec-branch.h> diff --git a/arch/x86/mm/setup_nx.c b/arch/x86/mm/setup_nx.c index 92e2eac..f65a33f 100644 --- a/arch/x86/mm/setup_nx.c +++ b/arch/x86/mm/setup_nx.c @@ -4,6 +4,7 @@
#include <asm/pgtable.h> #include <asm/proto.h> +#include <asm/cpufeature.h>
static int disable_nx;
diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c index 50d86c0..660a83c 100644 --- a/arch/x86/oprofile/op_model_amd.c +++ b/arch/x86/oprofile/op_model_amd.c @@ -24,7 +24,6 @@ #include <asm/nmi.h> #include <asm/apic.h> #include <asm/processor.h> -#include <asm/cpufeature.h>
#include "op_x86_model.h" #include "op_counter.h" diff --git a/arch/x86/um/asm/barrier.h b/arch/x86/um/asm/barrier.h index 755481f..764ac2f 100644 --- a/arch/x86/um/asm/barrier.h +++ b/arch/x86/um/asm/barrier.h @@ -3,7 +3,7 @@
#include <asm/asm.h> #include <asm/segment.h> -#include <asm/cpufeature.h> +#include <asm/cpufeatures.h> #include <asm/cmpxchg.h> #include <asm/nops.h>
diff --git a/lib/atomic64_test.c b/lib/atomic64_test.c index d51e25a..de67fea 100644 --- a/lib/atomic64_test.c +++ b/lib/atomic64_test.c @@ -17,7 +17,7 @@ #include <linux/atomic.h>
#ifdef CONFIG_X86 -#include <asm/processor.h> /* for boot_cpu_has below */ +#include <asm/cpufeature.h> /* for boot_cpu_has below */ #endif
#define TEST(bit, op, c_op, val) \
From: Borislav Petkov bp@suse.de
commit bc696ca05f5a8927329ec276a892341e006b00ba upstream
So the old one didn't work properly before alternatives had run. And it was supposed to provide an optimized JMP, because the assumption was that the offset being jumped to is within a signed byte and thus fits a two-byte JMP.
So I did an x86_64 allyesconfig build and dumped all the sites where static_cpu_has() was used. The optimization amounted to all of 12(!) places where static_cpu_has() had generated a 2-byte JMP, which saved us a whopping 36 bytes!
This clearly is not worth the trouble, so we can remove it. The only place where the optimization might count - in __switch_to() - we will handle differently. But that's not the subject of this patch.
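For reference, the entire saving comes from the JMP encoding; the per-call-site delta is roughly this (a sketch, not taken from the patch):

  /*
   * Short JMP: eb <rel8>  = 2 bytes (target must be within a signed byte)
   * Near JMP:  e9 <rel32> = 5 bytes
   *
   * 12 call sites * (5 - 2) bytes = 36 bytes saved tree-wide.
   */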
Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1453842730-28463-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/Kconfig.debug | 10 --- arch/x86/include/asm/cpufeature.h | 100 ++-------------------------------- arch/x86/include/asm/fpu/internal.h | 12 ++-- arch/x86/kernel/apic/apic_numachip.c | 4 + arch/x86/kernel/cpu/common.c | 12 +--- arch/x86/kernel/vm86_32.c | 2 - fs/btrfs/disk-io.c | 2 - 7 files changed, 19 insertions(+), 123 deletions(-)
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index da00fe1..2aa212f 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -367,16 +367,6 @@ config DEBUG_IMR_SELFTEST
If unsure say N here.
-config X86_DEBUG_STATIC_CPU_HAS - bool "Debug alternatives" - depends on DEBUG_KERNEL - ---help--- - This option causes additional code to be generated which - fails if static_cpu_has() is used before alternatives have - run. - - If unsure, say N. - config X86_DEBUG_FPU bool "Debug the x86 FPU code" depends on DEBUG_KERNEL diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index f62e872..b60598c 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -127,103 +127,19 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; #define cpu_has_osxsave boot_cpu_has(X86_FEATURE_OSXSAVE) #define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR) /* - * Do not add any more of those clumsy macros - use static_cpu_has_safe() for + * Do not add any more of those clumsy macros - use static_cpu_has() for * fast paths and boot_cpu_has() otherwise! */
#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS) -extern void warn_pre_alternatives(void); -extern bool __static_cpu_has_safe(u16 bit); +extern bool __static_cpu_has(u16 bit);
 /*
  * Static testing of CPU features. Used the same as boot_cpu_has().
  * These are only valid after alternatives have run, but will statically
  * patch the target code for additional performance.
  */
-static __always_inline __pure bool __static_cpu_has(u16 bit)
-{
-#ifdef CC_HAVE_ASM_GOTO
-
-#ifdef CONFIG_X86_DEBUG_STATIC_CPU_HAS
-
-	/*
-	 * Catch too early usage of this before alternatives
-	 * have run.
-	 */
-	asm_volatile_goto("1: jmp %l[t_warn]\n"
-		"2:\n"
-		".section .altinstructions,\"a\"\n"
-		" .long 1b - .\n"
-		" .long 0\n"			/* no replacement */
-		" .word %P0\n"			/* 1: do replace */
-		" .byte 2b - 1b\n"		/* source len */
-		" .byte 0\n"			/* replacement len */
-		" .byte 0\n"			/* pad len */
-		".previous\n"
-		/* skipping size check since replacement size = 0 */
-		: : "i" (X86_FEATURE_ALWAYS) : : t_warn);
-
-#endif
-
-	asm_volatile_goto("1: jmp %l[t_no]\n"
-		"2:\n"
-		".section .altinstructions,\"a\"\n"
-		" .long 1b - .\n"
-		" .long 0\n"			/* no replacement */
-		" .word %P0\n"			/* feature bit */
-		" .byte 2b - 1b\n"		/* source len */
-		" .byte 0\n"			/* replacement len */
-		" .byte 0\n"			/* pad len */
-		".previous\n"
-		/* skipping size check since replacement size = 0 */
-		: : "i" (bit) : : t_no);
-	return true;
- t_no:
-	return false;
-
-#ifdef CONFIG_X86_DEBUG_STATIC_CPU_HAS
- t_warn:
-	warn_pre_alternatives();
-	return false;
-#endif
-
-#else /* CC_HAVE_ASM_GOTO */
-
-	u8 flag;
-	/* Open-coded due to __stringify() in ALTERNATIVE() */
-	asm volatile("1: movb $0,%0\n"
-		"2:\n"
-		".section .altinstructions,\"a\"\n"
-		" .long 1b - .\n"
-		" .long 3f - .\n"
-		" .word %P1\n"			/* feature bit */
-		" .byte 2b - 1b\n"		/* source len */
-		" .byte 4f - 3f\n"		/* replacement len */
-		" .byte 0\n"			/* pad len */
-		".previous\n"
-		".section .discard,\"aw\",@progbits\n"
-		" .byte 0xff + (4f-3f) - (2b-1b)\n"	/* size check */
-		".previous\n"
-		".section .altinstr_replacement,\"ax\"\n"
-		"3: movb $1,%0\n"
-		"4:\n"
-		".previous\n"
-		: "=qm" (flag) : "i" (bit));
-	return flag;
-
-#endif /* CC_HAVE_ASM_GOTO */
-}
-
-#define static_cpu_has(bit)				\
-(							\
-	__builtin_constant_p(boot_cpu_has(bit)) ?	\
-		boot_cpu_has(bit) :			\
-	__builtin_constant_p(bit) ?			\
-		__static_cpu_has(bit) :			\
-		boot_cpu_has(bit)			\
-)
-
-static __always_inline __pure bool _static_cpu_has_safe(u16 bit)
+static __always_inline __pure bool _static_cpu_has(u16 bit)
 {
 #ifdef CC_HAVE_ASM_GOTO
	asm_volatile_goto("1: jmp %l[t_dynamic]\n"
@@ -257,7 +173,7 @@ static __always_inline __pure bool _static_cpu_has_safe(u16 bit)
  t_no:
	return false;
  t_dynamic:
-	return __static_cpu_has_safe(bit);
+	return __static_cpu_has(bit);
 #else
	u8 flag;
	/* Open-coded due to __stringify() in ALTERNATIVE() */
@@ -295,22 +211,21 @@ static __always_inline __pure bool _static_cpu_has_safe(u16 bit)
	".previous\n"
		: "=qm" (flag)
		: "i" (bit), "i" (X86_FEATURE_ALWAYS));
-	return (flag == 2 ? __static_cpu_has_safe(bit) : flag);
+	return (flag == 2 ? __static_cpu_has(bit) : flag);
 #endif /* CC_HAVE_ASM_GOTO */
 }
-#define static_cpu_has_safe(bit) \ +#define static_cpu_has(bit) \ ( \ __builtin_constant_p(boot_cpu_has(bit)) ? \ boot_cpu_has(bit) : \ - _static_cpu_has_safe(bit) \ + _static_cpu_has(bit) \ ) #else /* * gcc 3.x is too stupid to do the static test; fall back to dynamic. */ #define static_cpu_has(bit) boot_cpu_has(bit) -#define static_cpu_has_safe(bit) boot_cpu_has(bit) #endif
#define cpu_has_bug(c, bit) cpu_has(c, (bit)) @@ -318,7 +233,6 @@ static __always_inline __pure bool _static_cpu_has_safe(u16 bit) #define clear_cpu_bug(c, bit) clear_cpu_cap(c, (bit))
#define static_cpu_has_bug(bit) static_cpu_has((bit)) -#define static_cpu_has_bug_safe(bit) static_cpu_has_safe((bit)) #define boot_cpu_has_bug(bit) cpu_has_bug(&boot_cpu_data, (bit))
#define MAX_CPU_FEATURES (NCAPINTS * 32) diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 94e7570..ec2aedb 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -64,17 +64,17 @@ static __always_inline __pure bool use_eager_fpu(void)
static __always_inline __pure bool use_xsaveopt(void) { - return static_cpu_has_safe(X86_FEATURE_XSAVEOPT); + return static_cpu_has(X86_FEATURE_XSAVEOPT); }
static __always_inline __pure bool use_xsave(void) { - return static_cpu_has_safe(X86_FEATURE_XSAVE); + return static_cpu_has(X86_FEATURE_XSAVE); }
static __always_inline __pure bool use_fxsr(void) { - return static_cpu_has_safe(X86_FEATURE_FXSR); + return static_cpu_has(X86_FEATURE_FXSR); }
/* @@ -301,7 +301,7 @@ static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
WARN_ON(system_state != SYSTEM_BOOTING);
- if (static_cpu_has_safe(X86_FEATURE_XSAVES)) + if (static_cpu_has(X86_FEATURE_XSAVES)) XSTATE_OP(XSAVES, xstate, lmask, hmask, err); else XSTATE_OP(XSAVE, xstate, lmask, hmask, err); @@ -323,7 +323,7 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
WARN_ON(system_state != SYSTEM_BOOTING);
- if (static_cpu_has_safe(X86_FEATURE_XSAVES)) + if (static_cpu_has(X86_FEATURE_XSAVES)) XSTATE_OP(XRSTORS, xstate, lmask, hmask, err); else XSTATE_OP(XRSTOR, xstate, lmask, hmask, err); @@ -461,7 +461,7 @@ static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate) * pending. Clear the x87 state here by setting it to fixed values. * "m" is a random variable that should be in L1. */ - if (unlikely(static_cpu_has_bug_safe(X86_BUG_FXSAVE_LEAK))) { + if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) { asm volatile( "fnclex\n\t" "emms\n\t" diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c index 2bd2292..bac0805 100644 --- a/arch/x86/kernel/apic/apic_numachip.c +++ b/arch/x86/kernel/apic/apic_numachip.c @@ -30,7 +30,7 @@ static unsigned int numachip1_get_apic_id(unsigned long x) unsigned long value; unsigned int id = (x >> 24) & 0xff;
- if (static_cpu_has_safe(X86_FEATURE_NODEID_MSR)) { + if (static_cpu_has(X86_FEATURE_NODEID_MSR)) { rdmsrl(MSR_FAM10H_NODE_ID, value); id |= (value << 2) & 0xff00; } @@ -178,7 +178,7 @@ static void fixup_cpu_id(struct cpuinfo_x86 *c, int node) this_cpu_write(cpu_llc_id, node);
/* Account for nodes per socket in multi-core-module processors */ - if (static_cpu_has_safe(X86_FEATURE_NODEID_MSR)) { + if (static_cpu_has(X86_FEATURE_NODEID_MSR)) { rdmsrl(MSR_FAM10H_NODE_ID, val); nodes = ((val >> 3) & 7) + 1; } diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 5b6e43b..f31b26b 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1576,19 +1576,11 @@ void cpu_init(void) } #endif
-#ifdef CONFIG_X86_DEBUG_STATIC_CPU_HAS -void warn_pre_alternatives(void) -{ - WARN(1, "You're using static_cpu_has before alternatives have run!\n"); -} -EXPORT_SYMBOL_GPL(warn_pre_alternatives); -#endif - -inline bool __static_cpu_has_safe(u16 bit) +inline bool __static_cpu_has(u16 bit) { return boot_cpu_has(bit); } -EXPORT_SYMBOL_GPL(__static_cpu_has_safe); +EXPORT_SYMBOL_GPL(__static_cpu_has);
static void bsp_resume(void) { diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c index d6d64a5..7f4839e 100644 --- a/arch/x86/kernel/vm86_32.c +++ b/arch/x86/kernel/vm86_32.c @@ -358,7 +358,7 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus) /* make room for real-mode segments */ tsk->thread.sp0 += 16;
- if (static_cpu_has_safe(X86_FEATURE_SEP)) + if (static_cpu_has(X86_FEATURE_SEP)) tsk->thread.sysenter_cs = 0;
load_sp0(tss, &tsk->thread); diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 7efd70b..d106b98 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -923,7 +923,7 @@ static int check_async_write(struct inode *inode, unsigned long bio_flags) if (bio_flags & EXTENT_BIO_TREE_LOG) return 0; #ifdef CONFIG_X86 - if (static_cpu_has_safe(X86_FEATURE_XMM4_2)) + if (static_cpu_has(X86_FEATURE_XMM4_2)) return 0; #endif return 1;
From: Borislav Petkov bp@alien8.de
commit a362bf9f5e7dd659b96d01382da7b855f4e5a7a1 upstream
I can simply quote hpa from the mail:
"Get rid of the non-asm goto variant and just fall back to dynamic if asm goto is unavailable. It doesn't make any sense, really, if it is supposed to be safe, and by now the asm goto-capable gcc is in more wide use. (Originally the gcc 3.x fallback to pure dynamic didn't exist, either.)"
Boy, am I lazy.
Clean up the whole CC_HAVE_ASM_GOTO ifdeffery too, while at it.
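To make the remaining construct concrete, the asm goto pattern boils down to roughly this (a minimal sketch, not code from this patch; the real version also emits an .altinstructions record so the JMP can be patched away after boot):

  /* Sketch: a feature test the compiler knows may branch to a C label. */
  static __always_inline bool feature_test_sketch(void)
  {
  	asm_volatile_goto("1: jmp %l[t_no]\n"	/* patched to NOPs when the feature is set */
  			  "2:\n"
  			  : : : : t_no);
  	return true;
  t_no:
  	return false;
  }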
Suggested-by: H. Peter Anvin hpa@zytor.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160127084325.GB30712@pd.tnic Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 49 ++++--------------------------------- 1 file changed, 5 insertions(+), 44 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index b60598c..53de461 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -131,17 +131,16 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; * fast paths and boot_cpu_has() otherwise! */
-#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS) +#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS) extern bool __static_cpu_has(u16 bit);
 /*
  * Static testing of CPU features. Used the same as boot_cpu_has().
- * These are only valid after alternatives have run, but will statically
- * patch the target code for additional performance.
+ * These will statically patch the target code for additional
+ * performance.
  */
 static __always_inline __pure bool _static_cpu_has(u16 bit)
 {
-#ifdef CC_HAVE_ASM_GOTO
	asm_volatile_goto("1: jmp %l[t_dynamic]\n"
		"2:\n"
		".skip -(((5f-4f) - (2b-1b)) > 0) * "
			"((5f-4f) - (2b-1b)),0x90\n"
@@ -174,45 +173,6 @@ static __always_inline __pure bool _static_cpu_has(u16 bit)
	return false;
  t_dynamic:
	return __static_cpu_has(bit);
-#else
-	u8 flag;
-	/* Open-coded due to __stringify() in ALTERNATIVE() */
-	asm volatile("1: movb $2,%0\n"
-		"2:\n"
-		".section .altinstructions,\"a\"\n"
-		" .long 1b - .\n"		/* src offset */
-		" .long 3f - .\n"		/* repl offset */
-		" .word %P2\n"			/* always replace */
-		" .byte 2b - 1b\n"		/* source len */
-		" .byte 4f - 3f\n"		/* replacement len */
-		" .byte 0\n"			/* pad len */
-		".previous\n"
-		".section .discard,\"aw\",@progbits\n"
-		" .byte 0xff + (4f-3f) - (2b-1b)\n"	/* size check */
-		".previous\n"
-		".section .altinstr_replacement,\"ax\"\n"
-		"3: movb $0,%0\n"
-		"4:\n"
-		".previous\n"
-		".section .altinstructions,\"a\"\n"
-		" .long 1b - .\n"		/* src offset */
-		" .long 5f - .\n"		/* repl offset */
-		" .word %P1\n"			/* feature bit */
-		" .byte 4b - 3b\n"		/* src len */
-		" .byte 6f - 5f\n"		/* repl len */
-		" .byte 0\n"			/* pad len */
-		".previous\n"
-		".section .discard,\"aw\",@progbits\n"
-		" .byte 0xff + (6f-5f) - (4b-3b)\n"	/* size check */
-		".previous\n"
-		".section .altinstr_replacement,\"ax\"\n"
-		"5: movb $1,%0\n"
-		"6:\n"
-		".previous\n"
-		: "=qm" (flag)
-		: "i" (bit), "i" (X86_FEATURE_ALWAYS));
-	return (flag == 2 ? __static_cpu_has(bit) : flag);
-#endif /* CC_HAVE_ASM_GOTO */
 }
#define static_cpu_has(bit) \ @@ -223,7 +183,8 @@ static __always_inline __pure bool _static_cpu_has(u16 bit) ) #else /* - * gcc 3.x is too stupid to do the static test; fall back to dynamic. + * Fall back to dynamic for gcc versions which don't support asm goto. Should be + * a minority now anyway. */ #define static_cpu_has(bit) boot_cpu_has(bit) #endif
From: Borislav Petkov bp@suse.de
commit 337e4cc84021212a87b04b77b65cccc49304909e upstream
Add .altinstr_aux for additional instructions which will be used before and/or during patching. All stuff which needs more sophisticated patching should go there. See next patch.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1453842730-28463-8-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/vmlinux.lds.S | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index e065065..a703842 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -202,6 +202,17 @@ SECTIONS :init #endif
+ /* + * Section for code used exclusively before alternatives are run. All + * references to such code must be patched out by alternatives, normally + * by using X86_FEATURE_ALWAYS CPU feature bit. + * + * See static_cpu_has() for an example. + */ + .altinstr_aux : AT(ADDR(.altinstr_aux) - LOAD_OFFSET) { + *(.altinstr_aux) + } + INIT_DATA_SECTION(16)
.x86_cpu_dev.init : AT(ADDR(.x86_cpu_dev.init) - LOAD_OFFSET) {
From: Brian Gerst brgerst@gmail.com
commit 2476f2fa20568bd5d9e09cd35bcd73e99a6f4cc6 upstream
Move the code to do the dynamic check to the altinstr_aux section so that it is discarded after alternatives have run and a static branch has been chosen.
This way we're changing the dynamic branch from C code to assembly, which makes it *substantially* smaller while avoiding a completely unnecessary call to an out-of-line function.
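Expressed as C, the stub that now lives in .altinstr_aux checks exactly the byte and bit that the new asm operands describe (a sketch of the equivalence, not the patch itself):

  /* Sketch: what the "6f:" TESTB fallback computes before patching. */
  static inline bool dynamic_cpu_has_sketch(u16 bit)
  {
  	const char *caps = (const char *)boot_cpu_data.x86_capability;

  	/* testb %[bitnum],%[cap_byte] ; jnz t_yes ; jmp t_no */
  	return caps[bit >> 3] & (1 << (bit & 7));
  }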
Signed-off-by: Brian Gerst brgerst@gmail.com [ Changed it to do TESTB, as hpa suggested. ] Signed-off-by: Borislav Petkov bp@suse.de Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Lutomirski luto@amacapital.net Cc: Andy Lutomirski luto@kernel.org Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Young dyoung@redhat.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Kristen Carlson Accardi kristen@linux.intel.com Cc: Laura Abbott labbott@fedoraproject.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra (Intel) peterz@infradead.org Cc: Peter Zijlstra peterz@infradead.org Cc: Prarit Bhargava prarit@redhat.com Cc: Ross Zwisler ross.zwisler@linux.intel.com Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1452972124-7380-1-git-send-email-brgerst@gmail.com Link: http://lkml.kernel.org/r/20160127084525.GC30712@pd.tnic Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 19 ++++++++++++------- arch/x86/kernel/cpu/common.c | 6 ------ 2 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 53de461..1c9d6c5 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -132,8 +132,6 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; */
 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS)
-extern bool __static_cpu_has(u16 bit);
-
 /*
  * Static testing of CPU features. Used the same as boot_cpu_has().
  * These will statically patch the target code for additional
@@ -141,7 +139,7 @@
  */
 static __always_inline __pure bool _static_cpu_has(u16 bit)
 {
-	asm_volatile_goto("1: jmp %l[t_dynamic]\n"
+	asm_volatile_goto("1: jmp 6f\n"
		"2:\n"
		".skip -(((5f-4f) - (2b-1b)) > 0) * "
			"((5f-4f) - (2b-1b)),0x90\n"
@@ -166,13 +164,20 @@ static __always_inline __pure bool _static_cpu_has(u16 bit)
		" .byte 0\n"			/* repl len */
		" .byte 0\n"			/* pad len */
		".previous\n"
-		: : "i" (bit), "i" (X86_FEATURE_ALWAYS)
-		: : t_dynamic, t_no);
+		".section .altinstr_aux,\"ax\"\n"
+		"6:\n"
+		" testb %[bitnum],%[cap_byte]\n"
+		" jnz %l[t_yes]\n"
+		" jmp %l[t_no]\n"
+		".previous\n"
+		: : "i" (bit), "i" (X86_FEATURE_ALWAYS),
+		    [bitnum] "i" (1 << (bit & 7)),
+		    [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
+		: : t_yes, t_no);
+	t_yes:
	return true;
 t_no:
	return false;
-	t_dynamic:
-	return __static_cpu_has(bit);
 }
#define static_cpu_has(bit) \ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index f31b26b..58d56c4 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1576,12 +1576,6 @@ void cpu_init(void) } #endif
-inline bool __static_cpu_has(u16 bit) -{ - return boot_cpu_has(bit); -} -EXPORT_SYMBOL_GPL(__static_cpu_has); - static void bsp_resume(void) { if (this_cpu->c_bsp_resume)
From: Borislav Petkov bp@suse.de
commit 8c725306993198f845038dc9e45a1267099867a6 upstream
... and simplify and speed up a tad.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1453842730-28463-10-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/entry/vdso/vma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c index 5471ac3..6b46648 100644 --- a/arch/x86/entry/vdso/vma.c +++ b/arch/x86/entry/vdso/vma.c @@ -255,7 +255,7 @@ static void vgetcpu_cpu_init(void *arg) #ifdef CONFIG_NUMA node = cpu_to_node(cpu); #endif - if (cpu_has(&cpu_data(cpu), X86_FEATURE_RDTSCP)) + if (static_cpu_has(X86_FEATURE_RDTSCP)) write_rdtscp_aux((node << 12) | cpu);
/*
From: Alexander Kuleshov kuleshovmail@gmail.com
commit a4733143085d6c782ac1e6c85778655b6bac1d4e upstream
We are using %rax as a temporary register to check the kernel address alignment. We don't really have to, since the TEST instruction does not clobber the destination operand.
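In other words (a sketch of the equivalence with an assumed 2M mask, not the patch itself):

  #define SKETCH_PMD_PAGE_MASK (~((1UL << 21) - 1))	/* assumed 2M page mask */

  /* TEST computes src & dst and only sets FLAGS; it writes neither operand,
   * so the alignment check needs no scratch register. */
  static inline int addr_2m_misaligned_sketch(unsigned long addr)
  {
  	/* old: movq %rbp,%rax ; andl $~PMD_PAGE_MASK,%eax ; testl %eax,%eax
  	 * new: testl $~PMD_PAGE_MASK,%ebp */
  	return (addr & ~SKETCH_PMD_PAGE_MASK) != 0;
  }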
Suggested-by: Brian Gerst brgerst@gmail.com Signed-off-by: Alexander Kuleshov kuleshovmail@gmail.com Signed-off-by: Borislav Petkov bp@suse.de Cc: Alexander Popov alpopov@ptsecurity.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Andy Lutomirski luto@amacapital.net Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1453531828-19291-1-git-send-email-kuleshovmail@gmai... Link: http://lkml.kernel.org/r/1453842730-28463-11-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/head_64.S | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 4034e90..734ba1d 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -76,9 +76,7 @@ startup_64: subq $_text - __START_KERNEL_map, %rbp
 	/* Is the address not 2M aligned? */
-	movq	%rbp, %rax
-	andl	$~PMD_PAGE_MASK, %eax
-	testl	%eax, %eax
+	testl	$~PMD_PAGE_MASK, %ebp
 	jnz	bad_address
/*
From: Borislav Petkov bp@suse.de
commit f2cc8e0791c70833758101d9756609a08dd601ec upstream
When GCC cannot do constant folding for this macro, it falls back to cpu_has(). But static_cpu_has() is optimal and it works at all times now. So use it and speed up the fallback case.
Before we had this:
  mov    0x99d674(%rip),%rdx	# ffffffff81b0d9f4 <boot_cpu_data+0x34>
  shr    $0x2e,%rdx
  and    $0x1,%edx
  jne    ffffffff811704e9 <do_munmap+0x3f9>
After alternatives patching, it turns into:
  jmp    0xffffffff81170390
  nopl   (%rax)
  ...
  callq  ffffffff81056e00 <mpx_notify_unmap>
ffffffff81170390:
  mov    0x170(%r12),%rdi
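Callers don't change; what changes is the code the test compiles down to (an illustrative sketch; do_mpx_work() is a hypothetical callee):

  if (cpu_feature_enabled(X86_FEATURE_MPX)) {
  	/* Folds to 0 at compile time when MPX is in the disabled-features
  	 * mask, so this block is discarded entirely; otherwise it becomes
  	 * a statically patched branch via static_cpu_has() instead of the
  	 * old boot_cpu_data load-and-test shown above. */
  	do_mpx_work();
  }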
Signed-off-by: Borislav Petkov bp@suse.de Cc: Joerg Roedel joro@8bytes.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1455578358-28347-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 1c9d6c5..03ca602 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -88,8 +88,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; * is not relevant. */ #define cpu_feature_enabled(bit) \ - (__builtin_constant_p(bit) && DISABLED_MASK_BIT_SET(bit) ? 0 : \ - cpu_has(&boot_cpu_data, bit)) + (__builtin_constant_p(bit) && DISABLED_MASK_BIT_SET(bit) ? 0 : static_cpu_has(bit))
#define boot_cpu_has(bit) cpu_has(&boot_cpu_data, bit)
From: Dave Hansen dave.hansen@linux.intel.com
commit dfb4a70f20c5b3880da56ee4c9484bdb4e8f1e65 upstream
There are two CPUID bits for protection keys. One is for whether the CPU contains the feature, and the other will appear set once the OS enables protection keys. Specifically:
Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable Protection keys (and the RDPKRU/WRPKRU instructions)
This is because userspace cannot see CR4 contents, but it can see CPUID contents.
X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:
CPUID.(EAX=07H,ECX=0H):ECX.PKU [bit 3]
X86_FEATURE_OSPKE is "OSPKE":
CPUID.(EAX=07H,ECX=0H):ECX.OSPKE [bit 4]
These are the first CPU features which need to look at the ECX word in CPUID leaf 0x7, so this patch also includes fetching that word into the cpuinfo->x86_capability[] array.
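The fetch itself is the usual cpuid_count() pattern; roughly the following lands in get_cpu_cap() (a sketch with assumed local variable names):

  if (c->cpuid_level >= 0x00000007) {
  	u32 eax, ebx, ecx, edx;

  	cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
  	c->x86_capability[CPUID_7_ECX] = ecx;
  }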
Add it to the disabled-features mask when its config option is off. Even though we are not using it here, we also extend the REQUIRED_MASK_BIT_SET() macro to keep it mirroring the DISABLED_MASK_BIT_SET() version.
This means that in almost all code, you should use:
cpu_has(c, X86_FEATURE_PKU)
and *not* the CONFIG option.
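For instance (an illustrative sketch; setup_pku_sketch() is a hypothetical helper):

  if (cpu_has(c, X86_FEATURE_OSPKE)) {
  	/* The OS has set CR4.PKE, so RDPKRU/WRPKRU won't fault. */
  	setup_pku_sketch(c);
  }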
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210201.7714C250@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 59 ++++++++++++++++++++---------- arch/x86/include/asm/cpufeatures.h | 2 + arch/x86/include/asm/disabled-features.h | 15 ++++++++ arch/x86/include/asm/required-features.h | 7 ++++ arch/x86/kernel/cpu/common.c | 1 + 5 files changed, 63 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 03ca602..7fdd717 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -26,6 +26,7 @@ enum cpuid_leafs CPUID_8000_0008_EBX, CPUID_6_EAX, CPUID_8000_000A_EDX, + CPUID_7_ECX, };
#ifdef CONFIG_X86_FEATURE_NAMES @@ -48,28 +49,42 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; test_bit(bit, (unsigned long *)((c)->x86_capability))
#define REQUIRED_MASK_BIT_SET(bit) \ - ( (((bit)>>5)==0 && (1UL<<((bit)&31) & REQUIRED_MASK0)) || \ - (((bit)>>5)==1 && (1UL<<((bit)&31) & REQUIRED_MASK1)) || \ - (((bit)>>5)==2 && (1UL<<((bit)&31) & REQUIRED_MASK2)) || \ - (((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3)) || \ - (((bit)>>5)==4 && (1UL<<((bit)&31) & REQUIRED_MASK4)) || \ - (((bit)>>5)==5 && (1UL<<((bit)&31) & REQUIRED_MASK5)) || \ - (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6)) || \ - (((bit)>>5)==7 && (1UL<<((bit)&31) & REQUIRED_MASK7)) || \ - (((bit)>>5)==8 && (1UL<<((bit)&31) & REQUIRED_MASK8)) || \ - (((bit)>>5)==9 && (1UL<<((bit)&31) & REQUIRED_MASK9)) ) + ( (((bit)>>5)==0 && (1UL<<((bit)&31) & REQUIRED_MASK0 )) || \ + (((bit)>>5)==1 && (1UL<<((bit)&31) & REQUIRED_MASK1 )) || \ + (((bit)>>5)==2 && (1UL<<((bit)&31) & REQUIRED_MASK2 )) || \ + (((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3 )) || \ + (((bit)>>5)==4 && (1UL<<((bit)&31) & REQUIRED_MASK4 )) || \ + (((bit)>>5)==5 && (1UL<<((bit)&31) & REQUIRED_MASK5 )) || \ + (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6 )) || \ + (((bit)>>5)==7 && (1UL<<((bit)&31) & REQUIRED_MASK7 )) || \ + (((bit)>>5)==8 && (1UL<<((bit)&31) & REQUIRED_MASK8 )) || \ + (((bit)>>5)==9 && (1UL<<((bit)&31) & REQUIRED_MASK9 )) || \ + (((bit)>>5)==10 && (1UL<<((bit)&31) & REQUIRED_MASK10)) || \ + (((bit)>>5)==11 && (1UL<<((bit)&31) & REQUIRED_MASK11)) || \ + (((bit)>>5)==12 && (1UL<<((bit)&31) & REQUIRED_MASK12)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK13)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ + (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK16)) )
#define DISABLED_MASK_BIT_SET(bit) \ - ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0)) || \ - (((bit)>>5)==1 && (1UL<<((bit)&31) & DISABLED_MASK1)) || \ - (((bit)>>5)==2 && (1UL<<((bit)&31) & DISABLED_MASK2)) || \ - (((bit)>>5)==3 && (1UL<<((bit)&31) & DISABLED_MASK3)) || \ - (((bit)>>5)==4 && (1UL<<((bit)&31) & DISABLED_MASK4)) || \ - (((bit)>>5)==5 && (1UL<<((bit)&31) & DISABLED_MASK5)) || \ - (((bit)>>5)==6 && (1UL<<((bit)&31) & DISABLED_MASK6)) || \ - (((bit)>>5)==7 && (1UL<<((bit)&31) & DISABLED_MASK7)) || \ - (((bit)>>5)==8 && (1UL<<((bit)&31) & DISABLED_MASK8)) || \ - (((bit)>>5)==9 && (1UL<<((bit)&31) & DISABLED_MASK9)) ) + ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0 )) || \ + (((bit)>>5)==1 && (1UL<<((bit)&31) & DISABLED_MASK1 )) || \ + (((bit)>>5)==2 && (1UL<<((bit)&31) & DISABLED_MASK2 )) || \ + (((bit)>>5)==3 && (1UL<<((bit)&31) & DISABLED_MASK3 )) || \ + (((bit)>>5)==4 && (1UL<<((bit)&31) & DISABLED_MASK4 )) || \ + (((bit)>>5)==5 && (1UL<<((bit)&31) & DISABLED_MASK5 )) || \ + (((bit)>>5)==6 && (1UL<<((bit)&31) & DISABLED_MASK6 )) || \ + (((bit)>>5)==7 && (1UL<<((bit)&31) & DISABLED_MASK7 )) || \ + (((bit)>>5)==8 && (1UL<<((bit)&31) & DISABLED_MASK8 )) || \ + (((bit)>>5)==9 && (1UL<<((bit)&31) & DISABLED_MASK9 )) || \ + (((bit)>>5)==10 && (1UL<<((bit)&31) & DISABLED_MASK10)) || \ + (((bit)>>5)==11 && (1UL<<((bit)&31) & DISABLED_MASK11)) || \ + (((bit)>>5)==12 && (1UL<<((bit)&31) & DISABLED_MASK12)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK13)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ + (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ + (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK16)) )
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ @@ -79,6 +94,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ x86_this_cpu_test_bit(bit, (unsigned long *)&cpu_info.x86_capability))
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */ +#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ +#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ + /* * This macro is for detection of features which need kernel * infrastructure to be used. It may *not* directly test the CPU diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 255ea74..6ebb4c2d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -12,7 +12,7 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 16 /* N 32-bit words worth of info */ +#define NCAPINTS 17 /* N 32-bit words worth of info */ #define NBUGINTS 1 /* N 32-bit bug flags */
/* diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 8b17c2a..522a069 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -30,6 +30,14 @@ # define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31)) #endif /* CONFIG_X86_64 */
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS +# define DISABLE_PKU (1<<(X86_FEATURE_PKU)) +# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE)) +#else +# define DISABLE_PKU 0 +# define DISABLE_OSPKE 0 +#endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */ + /* * Make sure to add features to the correct mask */ @@ -43,5 +51,12 @@ #define DISABLED_MASK7 0 #define DISABLED_MASK8 0 #define DISABLED_MASK9 (DISABLE_MPX) +#define DISABLED_MASK10 0 +#define DISABLED_MASK11 0 +#define DISABLED_MASK12 0 +#define DISABLED_MASK13 0 +#define DISABLED_MASK14 0 +#define DISABLED_MASK15 0 +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index 5c6e4fb..4916144 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -92,5 +92,12 @@ #define REQUIRED_MASK7 0 #define REQUIRED_MASK8 0 #define REQUIRED_MASK9 0 +#define REQUIRED_MASK10 0 +#define REQUIRED_MASK11 0 +#define REQUIRED_MASK12 0 +#define REQUIRED_MASK13 0 +#define REQUIRED_MASK14 0 +#define REQUIRED_MASK15 0 +#define REQUIRED_MASK16 0
#endif /* _ASM_X86_REQUIRED_FEATURES_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 58d56c4..d6a7b6f2 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -693,6 +693,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c) c->x86_capability[CPUID_7_0_EBX] = ebx;
c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006); + c->x86_capability[CPUID_7_ECX] = ecx; }
/* Extended state features: level 0x0000000d */
From: Dave Hansen dave.hansen@linux.intel.com
commit 0d47638f80a02b15869f1fe1fc09e5bf996750fd upstream
Kirill Shutemov pointed this out to me.
The tip tree currently has commit:
dfb4a70f2 [x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions]
which added support for two new CPUID bits: X86_FEATURE_PKU and X86_FEATURE_OSPKE. But those bits were mis-merged and put in cpufeature.h instead of cpufeatures.h.
This didn't cause any breakage *except* it keeps the "ospke" and "pku" bits from showing up in cpuinfo.
Now cpuinfo has the two new flags:
flags : ... pku ospke
BTW, is it really wise to have cpufeature.h and cpufeatures.h? It seems like they can only cause confusion and mayhem with tab completion.
Reported-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Acked-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@kernel.org Cc: Dave Hansen dave@sr71.net Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160310221213.06F9DB53@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 4 ---- arch/x86/include/asm/cpufeatures.h | 4 ++++ 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 7fdd717..b953fb7 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -94,10 +94,6 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ x86_this_cpu_test_bit(bit, (unsigned long *)&cpu_info.x86_capability))
-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */ -#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ -#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ - /* * This macro is for detection of features which need kernel * infrastructure to be used. It may *not* directly test the CPU diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 6ebb4c2d..f19b901 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -276,6 +276,10 @@ #define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */ #define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */ +#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ +#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ + /* * BUG word(s) */
From: Yazen Ghannam Yazen.Ghannam@amd.com
commit 71faad43060d3d2040583635fbf7d1bdb3d04118 upstream
Add a new CPUID leaf to hold the contents of CPUID 0x80000007_EBX (RasCap).
Define bits that are currently in use:
Bit 0: McaOverflowRecov
Bit 1: SUCCOR
Bit 3: ScalableMca
Signed-off-by: Yazen Ghannam Yazen.Ghannam@amd.com [ Shorten comment. ] Signed-off-by: Borislav Petkov bp@suse.de Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Tony Luck tony.luck@intel.com Cc: linux-edac linux-edac@vger.kernel.org Link: http://lkml.kernel.org/r/1462971509-3856-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 1 + arch/x86/include/asm/cpufeatures.h | 7 ++++++- arch/x86/kernel/cpu/common.c | 10 +++++++--- 3 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index b953fb7..1d02ad6 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -27,6 +27,7 @@ enum cpuid_leafs CPUID_6_EAX, CPUID_8000_000A_EDX, CPUID_7_ECX, + CPUID_8000_0007_EBX, };
#ifdef CONFIG_X86_FEATURE_NAMES diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index f19b901..205ce70 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -12,7 +12,7 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 17 /* N 32-bit words worth of info */ +#define NCAPINTS 18 /* N 32-bit words worth of info */ #define NBUGINTS 1 /* N 32-bit bug flags */
/* @@ -280,6 +280,11 @@ #define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */
+/* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */ +#define X86_FEATURE_OVERFLOW_RECOV (17*32+0) /* MCA overflow recovery support */ +#define X86_FEATURE_SUCCOR (17*32+1) /* Uncorrectable error containment and recovery */ +#define X86_FEATURE_SMCA (17*32+3) /* Scalable MCA */ + /* * BUG word(s) */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index d6a7b6f2..814276d 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -741,6 +741,13 @@ void get_cpu_cap(struct cpuinfo_x86 *c) } }
+ if (c->extended_cpuid_level >= 0x80000007) { + cpuid(0x80000007, &eax, &ebx, &ecx, &edx); + + c->x86_capability[CPUID_8000_0007_EBX] = ebx; + c->x86_power = edx; + } + if (c->extended_cpuid_level >= 0x80000008) { cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
@@ -753,9 +760,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c) c->x86_phys_bits = 36; #endif
- if (c->extended_cpuid_level >= 0x80000007) - c->x86_power = cpuid_edx(0x80000007); - if (c->extended_cpuid_level >= 0x8000000a) c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
From: Dave Hansen dave.hansen@linux.intel.com
commit e8df1a95b685af84a81698199ee206e0e66a8b44 upstream
When I added support for the Memory Protection Keys processor feature, I had to reindent the REQUIRED/DISABLED_MASK macros, and also consult the later cpufeature words.
I'm not quite sure how I bungled it, but I consulted the wrong word at the end. This only affected required or disabled cpu features in cpufeature words 14, 15 and 16. So, only Protection Keys itself was screwed over here.
The result was that if you disabled pkeys in your .config, you might still see some code show up that should have been compiled out. There should be no functional problems, though.
In verifying this patch I also realized that the DISABLE_PKU/OSPKE macros were defined backwards and that the cpu_has() check in setup_pku() was not doing the compile-time disabled checks.
So also fix the macro for DISABLE_PKU/OSPKE and add a compile-time check for pkeys being enabled in setup_pku().
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Cc: stable@vger.kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Dave Hansen dave@sr71.net Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: dfb4a70f20c5 ("x86/cpufeature, x86/mm/pkeys: Add protection keys related CPUID definitions") Link: http://lkml.kernel.org/r/20160513221328.C200930B@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 12 ++++++------ arch/x86/include/asm/disabled-features.h | 6 +++--- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 1d02ad6..aa7785e 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -64,9 +64,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==11 && (1UL<<((bit)&31) & REQUIRED_MASK11)) || \ (((bit)>>5)==12 && (1UL<<((bit)&31) & REQUIRED_MASK12)) || \ (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK13)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ - (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK16)) ) + (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ + (((bit)>>5)==15 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ + (((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16)) )
#define DISABLED_MASK_BIT_SET(bit) \ ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0 )) || \ @@ -83,9 +83,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==11 && (1UL<<((bit)&31) & DISABLED_MASK11)) || \ (((bit)>>5)==12 && (1UL<<((bit)&31) & DISABLED_MASK12)) || \ (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK13)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ - (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK16)) ) + (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ + (((bit)>>5)==15 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ + (((bit)>>5)==16 && (1UL<<((bit)&31) & DISABLED_MASK16)) )
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 522a069..0403b22 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -31,11 +31,11 @@ #endif /* CONFIG_X86_64 */
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS -# define DISABLE_PKU (1<<(X86_FEATURE_PKU)) -# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE)) -#else # define DISABLE_PKU 0 # define DISABLE_OSPKE 0 +#else +# define DISABLE_PKU (1<<(X86_FEATURE_PKU & 31)) +# define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31)) #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
/*
From: Dave Hansen dave.hansen@linux.intel.com
commit 6e17cb9c2d5efd8fcc3934e983733302b9912ff8 upstream
A new CPUID capability word was added (bumping NCAPINTS), but the REQUIRED_MASK and DISABLED_MASK macros did not get updated. Update them.
None of the features were needed in these masks, so there was no harm, but we should keep them updated anyway.
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160629200107.8D3C9A31@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 6 ++++-- arch/x86/include/asm/disabled-features.h | 1 + arch/x86/include/asm/required-features.h | 1 + 3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index aa7785e..c7e38da 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -66,7 +66,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK13)) || \ (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ (((bit)>>5)==15 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ - (((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16)) ) + (((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16)) || \ + (((bit)>>5)==17 && (1UL<<((bit)&31) & REQUIRED_MASK17)))
#define DISABLED_MASK_BIT_SET(bit) \ ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0 )) || \ @@ -85,7 +86,8 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK13)) || \ (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ (((bit)>>5)==15 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ - (((bit)>>5)==16 && (1UL<<((bit)&31) & DISABLED_MASK16)) ) + (((bit)>>5)==16 && (1UL<<((bit)&31) & DISABLED_MASK16)) || \ + (((bit)>>5)==17 && (1UL<<((bit)&31) & DISABLED_MASK17)))
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 0403b22..ab6b05c 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -58,5 +58,6 @@ #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE) +#define DISABLED_MASK17 0
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index 4916144..fad4277 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -99,5 +99,6 @@ #define REQUIRED_MASK14 0 #define REQUIRED_MASK15 0 #define REQUIRED_MASK16 0 +#define REQUIRED_MASK17 0
#endif /* _ASM_X86_REQUIRED_FEATURES_H */
From: Dave Hansen dave.hansen@linux.intel.com
commit 1e61f78baf893c7eb49f633d23ccbb420c8f808e upstream
x86 has two macros which allow us to evaluate some CPUID-based features at compile time:
REQUIRED_MASK_BIT_SET()
DISABLED_MASK_BIT_SET()
They're both defined by having the compiler check the bit argument against some constant masks of features.
But, when adding new CPUID leaves, we need to check new words for these macros. So make sure that those macros and the REQUIRED_MASK* and DISABLED_MASK* get updated when necessary.
It looks kinda silly to have a value ("18" in this case) open-coded in 5 places in the code. But we really do need all 5 places updated when NCAPINTS gets bumped, so now we just force the issue.
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160629200108.92466F6F@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 8 ++++++-- arch/x86/include/asm/disabled-features.h | 1 + arch/x86/include/asm/required-features.h | 1 + 3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index c7e38da..3d5a6b5 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -67,7 +67,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ (((bit)>>5)==15 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ (((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16)) || \ - (((bit)>>5)==17 && (1UL<<((bit)&31) & REQUIRED_MASK17))) + (((bit)>>5)==17 && (1UL<<((bit)&31) & REQUIRED_MASK17)) || \ + REQUIRED_MASK_CHECK || \ + BUILD_BUG_ON_ZERO(NCAPINTS != 18))
#define DISABLED_MASK_BIT_SET(bit) \ ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0 )) || \ @@ -87,7 +89,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ (((bit)>>5)==15 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ (((bit)>>5)==16 && (1UL<<((bit)&31) & DISABLED_MASK16)) || \ - (((bit)>>5)==17 && (1UL<<((bit)&31) & DISABLED_MASK17))) + (((bit)>>5)==17 && (1UL<<((bit)&31) & DISABLED_MASK17)) || \ + DISABLED_MASK_CHECK || \ + BUILD_BUG_ON_ZERO(NCAPINTS != 18))
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index ab6b05c..21c5ac1 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -59,5 +59,6 @@ #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE) #define DISABLED_MASK17 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index fad4277..fac9a5c 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -100,5 +100,6 @@ #define REQUIRED_MASK15 0 #define REQUIRED_MASK16 0 #define REQUIRED_MASK17 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */
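
[ Note: for reference, BUILD_BUG_ON_ZERO() as used above evaluates to 0 when its condition is false and breaks the build otherwise. The kernel's long-standing definition in include/linux/bug.h is the negative-bitfield idiom: ]

	/* A negative-width bitfield is a compile-time error, so the
	 * sizeof() only compiles (evaluating to 0) when e is false. */
	#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))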
From: Dave Hansen dave.hansen@linux.intel.com
commit 8eda072e9d7c3429a372e3635dc5851f4a42dee1 upstream
Every time we add a word to our cpu features, we need to add something like this in two places:
(((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16))
The trick is getting the "16" in this case right in both places. I've now screwed this up twice, so as penance, I've come up with this patch to keep me and other poor souls from doing the same.
I also commented the logic behind the bit manipulation showcased above.
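
[ Note: a worked example of that bit manipulation, using X86_FEATURE_PKU (16*32+ 3) from earlier in this series: ]

	bit      = 16*32 + 3 = 515
	bit >> 5 = 16		/* selects REQUIRED_MASK16/DISABLED_MASK16 */
	bit & 31 = 3		/* bit position within that 32-bit word */
	1UL << 3		/* mask ANDed against the selected mask word */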
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160629200110.1BA8949E@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 90 +++++++++++++++++++++---------------- 1 file changed, 50 insertions(+), 40 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 3d5a6b5..dd00898 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -49,48 +49,58 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; #define test_cpu_cap(c, bit) \ test_bit(bit, (unsigned long *)((c)->x86_capability))
-#define REQUIRED_MASK_BIT_SET(bit) \ - ( (((bit)>>5)==0 && (1UL<<((bit)&31) & REQUIRED_MASK0 )) || \ - (((bit)>>5)==1 && (1UL<<((bit)&31) & REQUIRED_MASK1 )) || \ - (((bit)>>5)==2 && (1UL<<((bit)&31) & REQUIRED_MASK2 )) || \ - (((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3 )) || \ - (((bit)>>5)==4 && (1UL<<((bit)&31) & REQUIRED_MASK4 )) || \ - (((bit)>>5)==5 && (1UL<<((bit)&31) & REQUIRED_MASK5 )) || \ - (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6 )) || \ - (((bit)>>5)==7 && (1UL<<((bit)&31) & REQUIRED_MASK7 )) || \ - (((bit)>>5)==8 && (1UL<<((bit)&31) & REQUIRED_MASK8 )) || \ - (((bit)>>5)==9 && (1UL<<((bit)&31) & REQUIRED_MASK9 )) || \ - (((bit)>>5)==10 && (1UL<<((bit)&31) & REQUIRED_MASK10)) || \ - (((bit)>>5)==11 && (1UL<<((bit)&31) & REQUIRED_MASK11)) || \ - (((bit)>>5)==12 && (1UL<<((bit)&31) & REQUIRED_MASK12)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK13)) || \ - (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK14)) || \ - (((bit)>>5)==15 && (1UL<<((bit)&31) & REQUIRED_MASK15)) || \ - (((bit)>>5)==16 && (1UL<<((bit)&31) & REQUIRED_MASK16)) || \ - (((bit)>>5)==17 && (1UL<<((bit)&31) & REQUIRED_MASK17)) || \ - REQUIRED_MASK_CHECK || \ +/* + * There are 32 bits/features in each mask word. The high bits + * (selected with (bit>>5) give us the word number and the low 5 + * bits give us the bit/feature number inside the word. + * (1UL<<((bit)&31) gives us a mask for the feature_bit so we can + * see if it is set in the mask word. + */ +#define CHECK_BIT_IN_MASK_WORD(maskname, word, bit) \ + (((bit)>>5)==(word) && (1UL<<((bit)&31) & maskname##word )) + +#define REQUIRED_MASK_BIT_SET(feature_bit) \ + ( CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 0, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 1, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 2, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 3, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 4, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 5, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 6, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 7, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 8, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 9, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 10, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 11, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 12, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 13, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 14, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \ + REQUIRED_MASK_CHECK || \ BUILD_BUG_ON_ZERO(NCAPINTS != 18))
-#define DISABLED_MASK_BIT_SET(bit) \ - ( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0 )) || \ - (((bit)>>5)==1 && (1UL<<((bit)&31) & DISABLED_MASK1 )) || \ - (((bit)>>5)==2 && (1UL<<((bit)&31) & DISABLED_MASK2 )) || \ - (((bit)>>5)==3 && (1UL<<((bit)&31) & DISABLED_MASK3 )) || \ - (((bit)>>5)==4 && (1UL<<((bit)&31) & DISABLED_MASK4 )) || \ - (((bit)>>5)==5 && (1UL<<((bit)&31) & DISABLED_MASK5 )) || \ - (((bit)>>5)==6 && (1UL<<((bit)&31) & DISABLED_MASK6 )) || \ - (((bit)>>5)==7 && (1UL<<((bit)&31) & DISABLED_MASK7 )) || \ - (((bit)>>5)==8 && (1UL<<((bit)&31) & DISABLED_MASK8 )) || \ - (((bit)>>5)==9 && (1UL<<((bit)&31) & DISABLED_MASK9 )) || \ - (((bit)>>5)==10 && (1UL<<((bit)&31) & DISABLED_MASK10)) || \ - (((bit)>>5)==11 && (1UL<<((bit)&31) & DISABLED_MASK11)) || \ - (((bit)>>5)==12 && (1UL<<((bit)&31) & DISABLED_MASK12)) || \ - (((bit)>>5)==13 && (1UL<<((bit)&31) & DISABLED_MASK13)) || \ - (((bit)>>5)==14 && (1UL<<((bit)&31) & DISABLED_MASK14)) || \ - (((bit)>>5)==15 && (1UL<<((bit)&31) & DISABLED_MASK15)) || \ - (((bit)>>5)==16 && (1UL<<((bit)&31) & DISABLED_MASK16)) || \ - (((bit)>>5)==17 && (1UL<<((bit)&31) & DISABLED_MASK17)) || \ - DISABLED_MASK_CHECK || \ +#define DISABLED_MASK_BIT_SET(feature_bit) \ + ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 1, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 2, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 3, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 4, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 5, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 6, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 7, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 8, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 9, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 10, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 11, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 12, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 13, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 14, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \ + DISABLED_MASK_CHECK || \ BUILD_BUG_ON_ZERO(NCAPINTS != 18))
#define cpu_has(c, bit) \
From: Andy Lutomirski luto@kernel.org
commit 3df8d9208569ef0b2313e516566222d745f3b94b upstream.
A typo (or mis-merge?) resulted in leaf 6 only being probed if cpuid_level >= 7.
Fixes: 2ccd71f1b278 ("x86/cpufeature: Move some of the scattered feature bits to x86_capability") Signed-off-by: Andy Lutomirski luto@kernel.org Acked-by: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Link: http://lkml.kernel.org/r/6ea30c0e9daec21e488b54761881a6dfcf3e04d0.1481825597... Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/common.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 814276d..736e284 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -686,13 +686,14 @@ void get_cpu_cap(struct cpuinfo_x86 *c) c->x86_capability[CPUID_1_EDX] = edx; }
+ /* Thermal and Power Management Leaf: level 0x00000006 (eax) */ + if (c->cpuid_level >= 0x00000006) + c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006); + /* Additional Intel-defined flags: level 0x00000007 */ if (c->cpuid_level >= 0x00000007) { cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx); - c->x86_capability[CPUID_7_0_EBX] = ebx; - - c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006); c->x86_capability[CPUID_7_ECX] = ecx; }
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 95ca0ee8636059ea2800dfbac9ecac6212d6b38f)
This is a pure feature bits leaf. There are two AVX512 feature bits in it already which were handled as scattered bits, and three more from this leaf are going to be added for speculation control features.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Borislav Petkov bp@suse.de Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeature.h | 7 +++++-- arch/x86/include/asm/cpufeatures.h | 6 +++++- arch/x86/include/asm/disabled-features.h | 3 ++- arch/x86/include/asm/required-features.h | 3 ++- arch/x86/kernel/cpu/common.c | 1 + 5 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index dd00898..d72c1db 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -28,6 +28,7 @@ enum cpuid_leafs CPUID_8000_000A_EDX, CPUID_7_ECX, CPUID_8000_0007_EBX, + CPUID_7_EDX, };
#ifdef CONFIG_X86_FEATURE_NAMES @@ -78,8 +79,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \ REQUIRED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 18)) + BUILD_BUG_ON_ZERO(NCAPINTS != 19))
#define DISABLED_MASK_BIT_SET(feature_bit) \ ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \ @@ -100,8 +102,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32]; CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \ DISABLED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 18)) + BUILD_BUG_ON_ZERO(NCAPINTS != 19))
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 205ce70..da14ca8 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -12,7 +12,7 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 18 /* N 32-bit words worth of info */ +#define NCAPINTS 19 /* N 32-bit words worth of info */ #define NBUGINTS 1 /* N 32-bit bug flags */
/* @@ -285,6 +285,10 @@ #define X86_FEATURE_SUCCOR (17*32+1) /* Uncorrectable error containment and recovery */ #define X86_FEATURE_SMCA (17*32+3) /* Scalable MCA */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ +#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ +#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ + /* * BUG word(s) */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 21c5ac1..1f8cca4 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -59,6 +59,7 @@ #define DISABLED_MASK15 0 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE) #define DISABLED_MASK17 0 -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) +#define DISABLED_MASK18 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h index fac9a5c..6847d85 100644 --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -100,6 +100,7 @@ #define REQUIRED_MASK15 0 #define REQUIRED_MASK16 0 #define REQUIRED_MASK17 0 -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) +#define REQUIRED_MASK18 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 736e284..ac7c526 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -695,6 +695,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c) cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx); c->x86_capability[CPUID_7_0_EBX] = ebx; c->x86_capability[CPUID_7_ECX] = ecx; + c->x86_capability[CPUID_7_EDX] = edx; }
/* Extended state features: level 0x0000000d */
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit fc67dd70adb711a45d2ef34e12d1a8be75edde61)
Add three feature bits exposed by new microcode on Intel CPUs for speculation control.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Borislav Petkov bp@suse.de Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-3-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index da14ca8..f50e857 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -288,6 +288,9 @@ /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ +#define X86_FEATURE_SPEC_CTRL (18*32+26) /* Speculation Control (IBRS + IBPB) */ +#define X86_FEATURE_STIBP (18*32+27) /* Single Thread Indirect Branch Predictors */ +#define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
/* * BUG word(s)
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 5d10cbc91d9eb5537998b65608441b592eec65e7)
AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel. See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Tom Lendacky thomas.lendacky@amd.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index f50e857..a5671b8 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -251,6 +251,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ +#define X86_FEATURE_AMD_PRED_CMD (13*32+12) /* Prediction Command MSR (AMD) */ +#define X86_FEATURE_AMD_SPEC_CTRL (13*32+14) /* Speculation Control MSR only (AMD) */ +#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors (AMD) */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 1e340c60d0dd3ae07b5bedc16a0469c14b9f3410)
Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES.
See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/msr-index.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index b8911ae..f4701f0 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -32,6 +32,13 @@ #define EFER_FFXSR (1<<_EFER_FFXSR)
/* Intel MSRs. Some also available on other CPUs */ +#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ +#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ +#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ + +#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ +#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */ + #define MSR_IA32_PERFCTR0 0x000000c1 #define MSR_IA32_PERFCTR1 0x000000c2 #define MSR_FSB_FREQ 0x000000cd @@ -45,6 +52,11 @@ #define SNB_C3_AUTO_UNDEMOTE (1UL << 28)
#define MSR_MTRRcap 0x000000fe + +#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a +#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */ +#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */ + #define MSR_IA32_BBL_CR_CTL 0x00000119 #define MSR_IA32_BBL_CR_CTL3 0x0000011e
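
[ Note: illustrative sketch, not part of the patch; the real call site arrives in the IBPB patch later in this series. With these definitions, issuing a barrier is a single MSR write: ]

	/* PRED_CMD is a command MSR, not a state MSR: writing the IBPB
	 * bit triggers a one-shot barrier, and nothing is read back. */
	wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);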
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit fec9434a12f38d3aeafeb75711b71d8a1fdef621)
Also, for CPUs which don't speculate at all, don't report that they're vulnerable to the Spectre variants either.
Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it for now, even though that could be done with a simple comparison, on the assumption that we'll have more to add.
Based on suggestions from Dave Hansen and Alan Cox.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Borislav Petkov bp@suse.de Acked-by: Dave Hansen dave.hansen@intel.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/common.c | 48 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 43 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index ac7c526..d6c097c 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -43,6 +43,8 @@ #include <asm/pat.h> #include <asm/microcode.h> #include <asm/microcode_intel.h> +#include <asm/intel-family.h> +#include <asm/cpu_device_id.h>
#ifdef CONFIG_X86_LOCAL_APIC #include <asm/uv/uv.h> @@ -794,6 +796,41 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) #endif }
+static const __initdata struct x86_cpu_id cpu_no_speculation[] = { + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY }, + { X86_VENDOR_CENTAUR, 5 }, + { X86_VENDOR_INTEL, 5 }, + { X86_VENDOR_NSC, 5 }, + { X86_VENDOR_ANY, 4 }, + {} +}; + +static const __initdata struct x86_cpu_id cpu_no_meltdown[] = { + { X86_VENDOR_AMD }, + {} +}; + +static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c) +{ + u64 ia32_cap = 0; + + if (x86_match_cpu(cpu_no_meltdown)) + return false; + + if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); + + /* Rogue Data Cache Load? No! */ + if (ia32_cap & ARCH_CAP_RDCL_NO) + return false; + + return true; +} + /* * Do minimum CPU detection early. * Fields really needed: vendor, cpuid_level, family, model, mask, @@ -840,11 +877,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
- if (c->x86_vendor != X86_VENDOR_AMD) - setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); - - setup_force_cpu_bug(X86_BUG_SPECTRE_V1); - setup_force_cpu_bug(X86_BUG_SPECTRE_V2); + if (!x86_match_cpu(cpu_no_speculation)) { + if (cpu_vulnerable_to_meltdown(c)) + setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); + setup_force_cpu_bug(X86_BUG_SPECTRE_V1); + setup_force_cpu_bug(X86_BUG_SPECTRE_V2); + }
fpu__init_system(c);
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit a5b2966364538a0e68c9fa29bc0a3a1651799035)
This doesn't refuse to load the affected microcodes; it just refuses to use the Spectre v2 mitigation features if they're detected, by clearing the appropriate feature bits.
The AMD CPUID bits are handled here too, because hypervisors *may* have been exposing those bits even on Intel chips, for fine-grained control of what's available.
It is non-trivial to use x86_match_cpu() for this table because that doesn't handle steppings. And the approach taken in commit bd9240a18 almost made me lose my lunch.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/intel-family.h | 5 ++- arch/x86/kernel/cpu/intel.c | 67 +++++++++++++++++++++++++++++++++++ 2 files changed, 71 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index 6999f7d..12fa187 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -12,6 +12,7 @@ */
#define INTEL_FAM6_CORE_YONAH 0x0E + #define INTEL_FAM6_CORE2_MEROM 0x0F #define INTEL_FAM6_CORE2_MEROM_L 0x16 #define INTEL_FAM6_CORE2_PENRYN 0x17 @@ -20,6 +21,7 @@ #define INTEL_FAM6_NEHALEM 0x1E #define INTEL_FAM6_NEHALEM_EP 0x1A #define INTEL_FAM6_NEHALEM_EX 0x2E + #define INTEL_FAM6_WESTMERE 0x25 #define INTEL_FAM6_WESTMERE2 0x1F #define INTEL_FAM6_WESTMERE_EP 0x2C @@ -36,9 +38,9 @@ #define INTEL_FAM6_HASWELL_GT3E 0x46
#define INTEL_FAM6_BROADWELL_CORE 0x3D -#define INTEL_FAM6_BROADWELL_XEON_D 0x56 #define INTEL_FAM6_BROADWELL_GT3E 0x47 #define INTEL_FAM6_BROADWELL_X 0x4F +#define INTEL_FAM6_BROADWELL_XEON_D 0x56
#define INTEL_FAM6_SKYLAKE_MOBILE 0x4E #define INTEL_FAM6_SKYLAKE_DESKTOP 0x5E @@ -60,6 +62,7 @@ #define INTEL_FAM6_ATOM_MERRIFIELD2 0x5A /* Annidale */ #define INTEL_FAM6_ATOM_GOLDMONT 0x5C #define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */ +#define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A
/* Xeon Phi */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 9299e3b..23ba9cc 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -13,6 +13,7 @@ #include <asm/msr.h> #include <asm/bugs.h> #include <asm/cpu.h> +#include <asm/intel-family.h>
#ifdef CONFIG_X86_64 #include <linux/topology.h> @@ -25,6 +26,59 @@ #include <asm/apic.h> #endif
+/* + * Early microcode releases for the Spectre v2 mitigation were broken. + * Information taken from; + * - https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/microcode-upd... + * - https://kb.vmware.com/s/article/52345 + * - Microcode revisions observed in the wild + * - Release note from 20180108 microcode release + */ +struct sku_microcode { + u8 model; + u8 stepping; + u32 microcode; +}; +static const struct sku_microcode spectre_bad_microcodes[] = { + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x84 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0A, 0x84 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x84 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x84 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x84 }, + { INTEL_FAM6_SKYLAKE_X, 0x03, 0x0100013e }, + { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003c }, + { INTEL_FAM6_SKYLAKE_MOBILE, 0x03, 0xc2 }, + { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0xc2 }, + { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, + { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x1b }, + { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 }, + { INTEL_FAM6_BROADWELL_XEON_D, 0x03, 0x07000011 }, + { INTEL_FAM6_BROADWELL_X, 0x01, 0x0b000025 }, + { INTEL_FAM6_HASWELL_ULT, 0x01, 0x21 }, + { INTEL_FAM6_HASWELL_GT3E, 0x01, 0x18 }, + { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 }, + { INTEL_FAM6_HASWELL_X, 0x02, 0x3b }, + { INTEL_FAM6_HASWELL_X, 0x04, 0x10 }, + { INTEL_FAM6_IVYBRIDGE_X, 0x04, 0x42a }, + /* Updated in the 20180108 release; blacklist until we know otherwise */ + { INTEL_FAM6_ATOM_GEMINI_LAKE, 0x01, 0x22 }, + /* Observed in the wild */ + { INTEL_FAM6_SANDYBRIDGE_X, 0x06, 0x61b }, + { INTEL_FAM6_SANDYBRIDGE_X, 0x07, 0x712 }, +}; + +static bool bad_spectre_microcode(struct cpuinfo_x86 *c) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { + if (c->x86_model == spectre_bad_microcodes[i].model && + c->x86_mask == spectre_bad_microcodes[i].stepping) + return (c->microcode <= spectre_bad_microcodes[i].microcode); + } + return false; +} + static void early_init_intel(struct cpuinfo_x86 *c) { u64 misc_enable; @@ -51,6 +105,19 @@ static void early_init_intel(struct cpuinfo_x86 *c) rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode); }
+ if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || + cpu_has(c, X86_FEATURE_STIBP) || + cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) || + cpu_has(c, X86_FEATURE_AMD_PRED_CMD) || + cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) { + pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); + clear_cpu_cap(c, X86_FEATURE_STIBP); + clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); + clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); + clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); + } + /* * Atom erratum AAE44/AAF40/AAG38/AAH41: *
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 20ffa1caecca4db8f79fe665acdeaa5af815a24d)
Expose indirect_branch_prediction_barrier() for use in subsequent patches.
[ tglx: Add IBPB status to spectre_v2 sysfs file ]
Co-developed-by: KarimAllah Ahmed karahmed@amazon.de Signed-off-by: KarimAllah Ahmed karahmed@amazon.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/include/asm/nospec-branch.h | 13 +++++++++++++ arch/x86/kernel/cpu/bugs.c | 10 +++++++++- 3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index a5671b8..b4e370b 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -201,6 +201,8 @@ /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
+#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled*/ + /* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ #define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 8b91041..41851af 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -194,6 +194,19 @@ static inline void vmexit_fill_RSB(void) #endif }
+static inline void indirect_branch_prediction_barrier(void) +{ + asm volatile(ALTERNATIVE("", + "movl %[msr], %%ecx\n\t" + "movl %[val], %%eax\n\t" + "movl $0, %%edx\n\t" + "wrmsr", + X86_FEATURE_IBPB) + : : [msr] "i" (MSR_IA32_PRED_CMD), + [val] "i" (PRED_CMD_IBPB) + : "eax", "ecx", "edx", "memory"); +} + #endif /* __ASSEMBLY__ */
/* diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 2bbc74f..7def33a 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -296,6 +296,13 @@ retpoline_auto: setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); pr_info("Filling RSB on context switch\n"); } + + /* Initialize Indirect Branch Prediction Barrier if supported */ + if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) || + boot_cpu_has(X86_FEATURE_AMD_PRED_CMD)) { + setup_force_cpu_cap(X86_FEATURE_IBPB); + pr_info("Enabling Indirect Branch Prediction Barrier\n"); + } }
#undef pr_fmt @@ -325,7 +332,8 @@ ssize_t cpu_show_spectre_v2(struct device *dev, if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) return sprintf(buf, "Not affected\n");
- return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled], + return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], + boot_cpu_has(X86_FEATURE_IBPB) ? ", IBPB" : "", spectre_v2_module_string()); } #endif
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2)
We want to expose the hardware features simply in /proc/cpuinfo as "ibrs", "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them as the user-visible bits.
When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP bit is set, set the AMD STIBP that's used for the generic hardware capability.
Hide the rest from /proc/cpuinfo by putting "" in the comments, including RETPOLINE and RETPOLINE_AMD, which shouldn't be visible there. There are patches to make the sysfs vulnerabilities information non-readable by non-root, and the same should apply to all information about which mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.
The feature bit for whether IBPB is actually used, which is needed for ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.
Originally-by: Borislav Petkov bp@suse.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.u... Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 18 +++++++++--------- arch/x86/include/asm/nospec-branch.h | 2 +- arch/x86/kernel/cpu/bugs.c | 7 +++---- arch/x86/kernel/cpu/intel.c | 31 +++++++++++++++++++++---------- 4 files changed, 34 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index b4e370b..782005d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -194,14 +194,14 @@ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
#define X86_FEATURE_INTEL_PT ( 7*32+15) /* Intel Processor Trace */ -#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */ +#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */
-#define X86_FEATURE_RETPOLINE ( 7*32+29) /* Generic Retpoline mitigation for Spectre variant 2 */ -#define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* AMD Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE ( 7*32+29) /* "" Generic Retpoline mitigation for Spectre variant 2 */ +#define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* "" AMD Retpoline mitigation for Spectre variant 2 */ /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
-#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled*/ +#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ @@ -253,9 +253,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ -#define X86_FEATURE_AMD_PRED_CMD (13*32+12) /* Prediction Command MSR (AMD) */ -#define X86_FEATURE_AMD_SPEC_CTRL (13*32+14) /* Speculation Control MSR only (AMD) */ -#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors (AMD) */ +#define X86_FEATURE_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ +#define X86_FEATURE_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */ +#define X86_FEATURE_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ @@ -293,8 +293,8 @@ /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ -#define X86_FEATURE_SPEC_CTRL (18*32+26) /* Speculation Control (IBRS + IBPB) */ -#define X86_FEATURE_STIBP (18*32+27) /* Single Thread Indirect Branch Predictors */ +#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ +#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
/* diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 41851af..8dcecb9 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -201,7 +201,7 @@ static inline void indirect_branch_prediction_barrier(void) "movl %[val], %%eax\n\t" "movl $0, %%edx\n\t" "wrmsr", - X86_FEATURE_IBPB) + X86_FEATURE_USE_IBPB) : : [msr] "i" (MSR_IA32_PRED_CMD), [val] "i" (PRED_CMD_IBPB) : "eax", "ecx", "edx", "memory"); diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 7def33a..1968baf 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -298,9 +298,8 @@ retpoline_auto: }
/* Initialize Indirect Branch Prediction Barrier if supported */ - if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) || - boot_cpu_has(X86_FEATURE_AMD_PRED_CMD)) { - setup_force_cpu_cap(X86_FEATURE_IBPB); + if (boot_cpu_has(X86_FEATURE_IBPB)) { + setup_force_cpu_cap(X86_FEATURE_USE_IBPB); pr_info("Enabling Indirect Branch Prediction Barrier\n"); } } @@ -333,7 +332,7 @@ ssize_t cpu_show_spectre_v2(struct device *dev, return sprintf(buf, "Not affected\n");
return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], - boot_cpu_has(X86_FEATURE_IBPB) ? ", IBPB" : "", + boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", spectre_v2_module_string()); } #endif diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 23ba9cc..fee94ee 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -105,17 +105,28 @@ static void early_init_intel(struct cpuinfo_x86 *c) rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode); }
- if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || - cpu_has(c, X86_FEATURE_STIBP) || - cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) || - cpu_has(c, X86_FEATURE_AMD_PRED_CMD) || - cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) { - pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); - clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); + /* + * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support, + * and they also have a different bit for STIBP support. Also, + * a hypervisor might have set the individual AMD bits even on + * Intel CPUs, for finer-grained selection of what's available. + */ + if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) { + set_cpu_cap(c, X86_FEATURE_IBRS); + set_cpu_cap(c, X86_FEATURE_IBPB); + } + if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) + set_cpu_cap(c, X86_FEATURE_STIBP); + + /* Now if any of them are set, check the blacklist and clear the lot */ + if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) || + cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) { + pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n"); + clear_cpu_cap(c, X86_FEATURE_IBRS); + clear_cpu_cap(c, X86_FEATURE_IBPB); clear_cpu_cap(c, X86_FEATURE_STIBP); - clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); - clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); - clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); + clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP); }
/*
From: David Woodhouse dwmw@amazon.co.uk
(cherry picked from commit 7fcae1118f5fd44a862aa5c3525248e35ee67c3b)
Despite the fact that all the other code there seems to be doing it, just using set_cpu_cap() in early_init_intel() doesn't actually work.
For CPUs with PKU support, setup_pku() calls get_cpu_cap() after c->c_init() has set those feature bits. That resets those bits back to what was queried from the hardware.
Turning the bits off for bad microcode is easy to fix. That can just use setup_clear_cpu_cap() to force them off for all CPUs.
I was less keen on forcing the feature bits *on* that way, just in case of inconsistencies. I appreciate that the kernel is going to get this utterly wrong if CPU features are not consistent, because it has already applied alternatives by the time secondary CPUs are brought up.
But at least if setup_force_cpu_cap() isn't being used, we might have a chance of *detecting* the lack of the corresponding bit and either panicking or refusing to bring the offending CPU online.
So ensure that the appropriate feature bits are set within get_cpu_cap() regardless of how many extra times it's called.
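For reviewers unfamiliar with the distinction, a hedged sketch of why the setup_ variant sticks (the exact re-read ordering is my assumption, based on the apply_forced_caps() helper visible in the diff below):

	setup_clear_cpu_cap(X86_FEATURE_IBRS);	/* recorded in a global 'cleared' mask */
	get_cpu_cap(c);				/* CPUID re-read would set the bit again... */
	apply_forced_caps(c);			/* ...but the recorded mask clears it once more */

A plain clear_cpu_cap(c, ...) only touches this CPU's x86_capability[] and is undone by the next get_cpu_cap() re-read.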
Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.u... Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/common.c | 21 +++++++++++++++++++++ arch/x86/kernel/cpu/intel.c | 27 ++++++++------------------- 2 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index d6c097c..72d7e5a 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -676,6 +676,26 @@ static void apply_forced_caps(struct cpuinfo_x86 *c) } }
+static void init_speculation_control(struct cpuinfo_x86 *c) +{ + /* + * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support, + * and they also have a different bit for STIBP support. Also, + * a hypervisor might have set the individual AMD bits even on + * Intel CPUs, for finer-grained selection of what's available. + * + * We use the AMD bits in 0x8000_0008 EBX as the generic hardware + * features, which are visible in /proc/cpuinfo and used by the + * kernel. So set those accordingly from the Intel bits. + */ + if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) { + set_cpu_cap(c, X86_FEATURE_IBRS); + set_cpu_cap(c, X86_FEATURE_IBPB); + } + if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) + set_cpu_cap(c, X86_FEATURE_STIBP); +} + void get_cpu_cap(struct cpuinfo_x86 *c) { u32 eax, ebx, ecx, edx; @@ -768,6 +788,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c) c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
init_scattered_cpuid_features(c); + init_speculation_control(c); }
static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index fee94ee..0f13189 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -105,28 +105,17 @@ static void early_init_intel(struct cpuinfo_x86 *c) rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode); }
- /* - * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support, - * and they also have a different bit for STIBP support. Also, - * a hypervisor might have set the individual AMD bits even on - * Intel CPUs, for finer-grained selection of what's available. - */ - if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) { - set_cpu_cap(c, X86_FEATURE_IBRS); - set_cpu_cap(c, X86_FEATURE_IBPB); - } - if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) - set_cpu_cap(c, X86_FEATURE_STIBP); - /* Now if any of them are set, check the blacklist and clear the lot */ - if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) || + if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || + cpu_has(c, X86_FEATURE_INTEL_STIBP) || + cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) || cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) { pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n"); - clear_cpu_cap(c, X86_FEATURE_IBRS); - clear_cpu_cap(c, X86_FEATURE_IBPB); - clear_cpu_cap(c, X86_FEATURE_STIBP); - clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); - clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP); + setup_clear_cpu_cap(X86_FEATURE_IBRS); + setup_clear_cpu_cap(X86_FEATURE_IBPB); + setup_clear_cpu_cap(X86_FEATURE_STIBP); + setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL); + setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP); }
/*
From: Arnd Bergmann arnd@arndb.de
(cherry picked from commit 4bf5d56d429cbc96c23d809a08f63cd29e1a702e)
I'm seeing build failures from the two newly introduced arrays that are marked 'const' and '__initdata', which are mutually exclusive:
arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init' arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init'
The correct annotation is __initconst.
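A minimal sketch of the distinction (table contents illustrative):

	/*
	 * __initdata places objects in the writable .init.data section,
	 * which conflicts with 'const'; const init-time tables belong in
	 * .init.rodata via __initconst.
	 */
	static const struct x86_cpu_id example_ids[] __initconst = {
		{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
		{}
	};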
Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Ricardo Neri ricardo.neri-calderon@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@suse.de Cc: Thomas Garnier thgarnie@google.com Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 72d7e5a..48499b4 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -817,7 +817,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) #endif }
-static const __initdata struct x86_cpu_id cpu_no_speculation[] = { +static const __initconst struct x86_cpu_id cpu_no_speculation[] = { { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY }, { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY }, { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY }, @@ -830,7 +830,7 @@ static const __initdata struct x86_cpu_id cpu_no_speculation[] = { {} };
-static const __initdata struct x86_cpu_id cpu_no_meltdown[] = { +static const __initconst struct x86_cpu_id cpu_no_meltdown[] = { { X86_VENDOR_AMD }, {} };
From: Denys Vlasenko dvlasenk@redhat.com
commit 778843f934e362ed4ed734520f60a44a78a074b4 upstream
Use of a temporary R8 register here seems to be unnecessary.
"push %r8" is a two-byte insn (it needs REX prefix to specify R8), "push $0" is two-byte too. It seems just using the latter would be no worse.
Thus, the code had an unnecessary "xorq %r8,%r8" insn. It probably costs nothing in execution time here since we are probably limited by store bandwidth at this point, but still.
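For reference, the relevant encodings (verifiable with any x86-64 assembler):

	push %r8	/* 41 50    - REX.B prefix + PUSH, two bytes  */
	push $0		/* 6a 00    - PUSH imm8, also two bytes       */
	xorq %r8,%r8	/* 4d 31 c0 - three bytes, now saved entirely */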
Run-tested under QEMU: 32-bit calls still work:
/ # ./test_syscall_vdso32 [RUN] Executing 6-argument 32-bit syscall via VDSO [OK] Arguments are preserved across syscall [NOTE] R11 has changed:0000000000200ed7 - assuming clobbered by SYSRET insn [OK] R8..R15 did not leak kernel data [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK] Arguments are preserved across syscall [OK] R8..R15 did not leak kernel data [RUN] Running tests under ptrace [RUN] Executing 6-argument 32-bit syscall via VDSO [OK] Arguments are preserved across syscall [NOTE] R11 has changed:0000000000200ed7 - assuming clobbered by SYSRET insn [OK] R8..R15 did not leak kernel data [RUN] Executing 6-argument 32-bit syscall via INT 80 [OK] Arguments are preserved across syscall [OK] R8..R15 did not leak kernel data
Signed-off-by: Denys Vlasenko dvlasenk@redhat.com Acked-by: Andy Lutomirski luto@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Frederic Weisbecker fweisbec@gmail.com Cc: H. Peter Anvin hpa@zytor.com Cc: Kees Cook keescook@chromium.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Steven Rostedt rostedt@goodmis.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Drewry wad@chromium.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1462201010-16846-1-git-send-email-dvlasenk@redhat.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/entry/entry_64_compat.S | 45 ++++++++++++++++++-------------------- 1 file changed, 21 insertions(+), 24 deletions(-)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S index d03bf0e..e479ff8 100644 --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -79,24 +79,23 @@ ENTRY(entry_SYSENTER_compat) ASM_CLAC /* Clear AC after saving FLAGS */
pushq $__USER32_CS /* pt_regs->cs */ - xorq %r8,%r8 - pushq %r8 /* pt_regs->ip = 0 (placeholder) */ + pushq $0 /* pt_regs->ip = 0 (placeholder) */ pushq %rax /* pt_regs->orig_ax */ pushq %rdi /* pt_regs->di */ pushq %rsi /* pt_regs->si */ pushq %rdx /* pt_regs->dx */ pushq %rcx /* pt_regs->cx */ pushq $-ENOSYS /* pt_regs->ax */ - pushq %r8 /* pt_regs->r8 = 0 */ - pushq %r8 /* pt_regs->r9 = 0 */ - pushq %r8 /* pt_regs->r10 = 0 */ - pushq %r8 /* pt_regs->r11 = 0 */ + pushq $0 /* pt_regs->r8 = 0 */ + pushq $0 /* pt_regs->r9 = 0 */ + pushq $0 /* pt_regs->r10 = 0 */ + pushq $0 /* pt_regs->r11 = 0 */ pushq %rbx /* pt_regs->rbx */ pushq %rbp /* pt_regs->rbp (will be overwritten) */ - pushq %r8 /* pt_regs->r12 = 0 */ - pushq %r8 /* pt_regs->r13 = 0 */ - pushq %r8 /* pt_regs->r14 = 0 */ - pushq %r8 /* pt_regs->r15 = 0 */ + pushq $0 /* pt_regs->r12 = 0 */ + pushq $0 /* pt_regs->r13 = 0 */ + pushq $0 /* pt_regs->r14 = 0 */ + pushq $0 /* pt_regs->r15 = 0 */ cld
/* @@ -185,17 +184,16 @@ ENTRY(entry_SYSCALL_compat) pushq %rdx /* pt_regs->dx */ pushq %rbp /* pt_regs->cx (stashed in bp) */ pushq $-ENOSYS /* pt_regs->ax */ - xorq %r8,%r8 - pushq %r8 /* pt_regs->r8 = 0 */ - pushq %r8 /* pt_regs->r9 = 0 */ - pushq %r8 /* pt_regs->r10 = 0 */ - pushq %r8 /* pt_regs->r11 = 0 */ + pushq $0 /* pt_regs->r8 = 0 */ + pushq $0 /* pt_regs->r9 = 0 */ + pushq $0 /* pt_regs->r10 = 0 */ + pushq $0 /* pt_regs->r11 = 0 */ pushq %rbx /* pt_regs->rbx */ pushq %rbp /* pt_regs->rbp (will be overwritten) */ - pushq %r8 /* pt_regs->r12 = 0 */ - pushq %r8 /* pt_regs->r13 = 0 */ - pushq %r8 /* pt_regs->r14 = 0 */ - pushq %r8 /* pt_regs->r15 = 0 */ + pushq $0 /* pt_regs->r12 = 0 */ + pushq $0 /* pt_regs->r13 = 0 */ + pushq $0 /* pt_regs->r14 = 0 */ + pushq $0 /* pt_regs->r15 = 0 */
/* * User mode is traced as though IRQs are on, and SYSENTER @@ -292,11 +290,10 @@ ENTRY(entry_INT80_compat) pushq %rdx /* pt_regs->dx */ pushq %rcx /* pt_regs->cx */ pushq $-ENOSYS /* pt_regs->ax */ - xorq %r8,%r8 - pushq %r8 /* pt_regs->r8 = 0 */ - pushq %r8 /* pt_regs->r9 = 0 */ - pushq %r8 /* pt_regs->r10 = 0 */ - pushq %r8 /* pt_regs->r11 = 0 */ + pushq $0 /* pt_regs->r8 = 0 */ + pushq $0 /* pt_regs->r9 = 0 */ + pushq $0 /* pt_regs->r10 = 0 */ + pushq $0 /* pt_regs->r11 = 0 */ pushq %rbx /* pt_regs->rbx */ pushq %rbp /* pt_regs->rbp */ pushq %r12 /* pt_regs->r12 */
From: Dan Williams dan.j.williams@intel.com
commit 6b8cf5cc9965673951f1ab3f0e3cf23d06e3e2ee upstream.
At entry userspace may have populated registers with values that could otherwise be useful in a speculative execution attack. Clear them to minimize the kernel's attack surface.
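One detail worth noting (my reading of the diff, not stated in the changelog): the patch uses xorl for the legacy registers because a 32-bit XOR zero-extends and clears the full 64-bit register in two bytes, while r8-r15 need a REX prefix in any case:

	xorl %ebx, %ebx	/* 31 db    - clears all of %rbx in two bytes       */
	xorq %r8, %r8	/* 4d 31 c0 - REX prefix is unavoidable for r8-r15  */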
Originally-From: Andi Kleen ak@linux.intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Cc: stable@vger.kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/151787989697.7847.4083702787288600552.stgit@dwillia... [ Made small improvements to the changelog. ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/entry/entry_64_compat.S | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S index e479ff8..48c27c3 100644 --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -87,15 +87,25 @@ ENTRY(entry_SYSENTER_compat) pushq %rcx /* pt_regs->cx */ pushq $-ENOSYS /* pt_regs->ax */ pushq $0 /* pt_regs->r8 = 0 */ + xorq %r8, %r8 /* nospec r8 */ pushq $0 /* pt_regs->r9 = 0 */ + xorq %r9, %r9 /* nospec r9 */ pushq $0 /* pt_regs->r10 = 0 */ + xorq %r10, %r10 /* nospec r10 */ pushq $0 /* pt_regs->r11 = 0 */ + xorq %r11, %r11 /* nospec r11 */ pushq %rbx /* pt_regs->rbx */ + xorl %ebx, %ebx /* nospec rbx */ pushq %rbp /* pt_regs->rbp (will be overwritten) */ + xorl %ebp, %ebp /* nospec rbp */ pushq $0 /* pt_regs->r12 = 0 */ + xorq %r12, %r12 /* nospec r12 */ pushq $0 /* pt_regs->r13 = 0 */ + xorq %r13, %r13 /* nospec r13 */ pushq $0 /* pt_regs->r14 = 0 */ + xorq %r14, %r14 /* nospec r14 */ pushq $0 /* pt_regs->r15 = 0 */ + xorq %r15, %r15 /* nospec r15 */ cld
/* @@ -185,15 +195,25 @@ ENTRY(entry_SYSCALL_compat) pushq %rbp /* pt_regs->cx (stashed in bp) */ pushq $-ENOSYS /* pt_regs->ax */ pushq $0 /* pt_regs->r8 = 0 */ + xorq %r8, %r8 /* nospec r8 */ pushq $0 /* pt_regs->r9 = 0 */ + xorq %r9, %r9 /* nospec r9 */ pushq $0 /* pt_regs->r10 = 0 */ + xorq %r10, %r10 /* nospec r10 */ pushq $0 /* pt_regs->r11 = 0 */ + xorq %r11, %r11 /* nospec r11 */ pushq %rbx /* pt_regs->rbx */ + xorl %ebx, %ebx /* nospec rbx */ pushq %rbp /* pt_regs->rbp (will be overwritten) */ + xorl %ebp, %ebp /* nospec rbp */ pushq $0 /* pt_regs->r12 = 0 */ + xorq %r12, %r12 /* nospec r12 */ pushq $0 /* pt_regs->r13 = 0 */ + xorq %r13, %r13 /* nospec r13 */ pushq $0 /* pt_regs->r14 = 0 */ + xorq %r14, %r14 /* nospec r14 */ pushq $0 /* pt_regs->r15 = 0 */ + xorq %r15, %r15 /* nospec r15 */
/* * User mode is traced as though IRQs are on, and SYSENTER @@ -291,15 +311,25 @@ ENTRY(entry_INT80_compat) pushq %rcx /* pt_regs->cx */ pushq $-ENOSYS /* pt_regs->ax */ pushq $0 /* pt_regs->r8 = 0 */ + xorq %r8, %r8 /* nospec r8 */ pushq $0 /* pt_regs->r9 = 0 */ + xorq %r9, %r9 /* nospec r9 */ pushq $0 /* pt_regs->r10 = 0 */ + xorq %r10, %r10 /* nospec r10 */ pushq $0 /* pt_regs->r11 = 0 */ + xorq %r11, %r11 /* nospec r11 */ pushq %rbx /* pt_regs->rbx */ + xorl %ebx, %ebx /* nospec rbx */ pushq %rbp /* pt_regs->rbp */ + xorl %ebp, %ebp /* nospec rbp */ pushq %r12 /* pt_regs->r12 */ + xorq %r12, %r12 /* nospec r12 */ pushq %r13 /* pt_regs->r13 */ + xorq %r13, %r13 /* nospec r13 */ pushq %r14 /* pt_regs->r14 */ + xorq %r14, %r14 /* nospec r14 */ pushq %r15 /* pt_regs->r15 */ + xorq %r15, %r15 /* nospec r15 */ cld
/*
From: David Woodhouse dwmw@amazon.co.uk
commit 1751342095f0d2b36fa8114d8e12c5688c455ac4 upstream.
Intel have retroactively blessed the 0xc2 microcode on Skylake mobile and desktop parts, and the Gemini Lake 0x22 microcode is apparently fine too. We blacklisted the latter purely because it was present with all the other problematic ones in the 2018-01-08 release, but now it's explicitly listed as OK.
We still list 0x84 for the various Kaby Lake / Coffee Lake parts, as that appeared in one version of the blacklist and then reverted to 0x80 again. We can change it if 0x84 is actually announced to be safe.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: Andy Lutomirski luto@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dan Williams dan.j.williams@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: arjan.van.de.ven@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: sironi@amazon.de Link: http://lkml.kernel.org/r/1518305967-31356-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/intel.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 0f13189..71492d2 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -47,8 +47,6 @@ static const struct sku_microcode spectre_bad_microcodes[] = { { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x84 }, { INTEL_FAM6_SKYLAKE_X, 0x03, 0x0100013e }, { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003c }, - { INTEL_FAM6_SKYLAKE_MOBILE, 0x03, 0xc2 }, - { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0xc2 }, { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x1b }, { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 }, @@ -60,8 +58,6 @@ static const struct sku_microcode spectre_bad_microcodes[] = { { INTEL_FAM6_HASWELL_X, 0x02, 0x3b }, { INTEL_FAM6_HASWELL_X, 0x04, 0x10 }, { INTEL_FAM6_IVYBRIDGE_X, 0x04, 0x42a }, - /* Updated in the 20180108 release; blacklist until we know otherwise */ - { INTEL_FAM6_ATOM_GEMINI_LAKE, 0x01, 0x22 }, /* Observed in the wild */ { INTEL_FAM6_SANDYBRIDGE_X, 0x06, 0x61b }, { INTEL_FAM6_SANDYBRIDGE_X, 0x07, 0x712 },
From: David Woodhouse dwmw@amazon.co.uk
commit d37fc6d360a404b208547ba112e7dabb6533c7fc upstream.
Arjan points out that the Intel document only clears the 0xc2 microcode on *some* parts with CPUID 506E3 (INTEL_FAM6_SKYLAKE_DESKTOP stepping 3). For the Skylake H/S platform it's OK but for Skylake E3 which has the same CPUID it isn't (yet) cleared.
So removing it from the blacklist was premature. Put it back for now.
Also, Arjan assures me that the 0x84 microcode for Kaby Lake, which was featured in one of the early revisions of the Intel document, was never released to the public, and won't be until/unless it is also validated as safe. So those can change to 0x80, which is what all *other* versions of the doc have identified.
Once the retrospective testing of existing public microcodes is done, we should be back into a mode where new microcodes are only released in batches and we shouldn't even need to update the blacklist for those anyway, so this tweaking of the list isn't expected to be a thing which keeps happening.
Requested-by: Arjan van de Ven arjan.van.de.ven@intel.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: Andy Lutomirski luto@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dan Williams dan.j.williams@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: arjan.van.de.ven@intel.com Cc: dave.hansen@intel.com Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Link: http://lkml.kernel.org/r/1518449255-2182-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/intel.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 71492d2..b69d258 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -40,13 +40,14 @@ struct sku_microcode { u32 microcode; }; static const struct sku_microcode spectre_bad_microcodes[] = { - { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x84 }, - { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0A, 0x84 }, - { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x84 }, - { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x84 }, - { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x84 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x80 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0A, 0x80 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x80 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x80 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x80 }, { INTEL_FAM6_SKYLAKE_X, 0x03, 0x0100013e }, { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003c }, + { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0xc2 }, { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x1b }, { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 },
From: Ingo Molnar mingo@kernel.org
commit 21e433bdb95bdf3aa48226fd3d33af608437f293 upstream.
Harmonize all the Spectre messages so that a:
dmesg | grep -i spectre
... gives us most Spectre related kernel boot messages.
Also fix a few other details:
- clarify a comment about firmware speculation control
- s/KPTI/PTI
- remove various line-breaks that made the code uglier
Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andy Lutomirski luto@kernel.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dan Williams dan.j.williams@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: David Woodhouse dwmw2@infradead.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 1968baf..fea368d 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -162,8 +162,7 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) if (cmdline_find_option_bool(boot_command_line, "nospectre_v2")) return SPECTRE_V2_CMD_NONE; else { - ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, - sizeof(arg)); + ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg)); if (ret < 0) return SPECTRE_V2_CMD_AUTO;
@@ -184,8 +183,7 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) cmd == SPECTRE_V2_CMD_RETPOLINE_AMD || cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) && !IS_ENABLED(CONFIG_RETPOLINE)) { - pr_err("%s selected but not compiled in. Switching to AUTO select\n", - mitigation_options[i].option); + pr_err("%s selected but not compiled in. Switching to AUTO select\n", mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; }
@@ -255,14 +253,14 @@ static void __init spectre_v2_select_mitigation(void) goto retpoline_auto; break; } - pr_err("kernel not compiled with retpoline; no mitigation available!"); + pr_err("Spectre mitigation: kernel not compiled with retpoline; no mitigation available!"); return;
retpoline_auto: if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { retpoline_amd: if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) { - pr_err("LFENCE not serializing. Switching to generic retpoline\n"); + pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n"); goto retpoline_generic; } mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD : @@ -280,7 +278,7 @@ retpoline_auto: pr_info("%s\n", spectre_v2_strings[mode]);
/* - * If neither SMEP or KPTI are available, there is a risk of + * If neither SMEP nor PTI are available, there is a risk of * hitting userspace addresses in the RSB after a context switch * from a shallow call stack to a deeper one. To prevent this fill * the entire RSB, even when using IBRS. @@ -294,21 +292,20 @@ retpoline_auto: if ((!boot_cpu_has(X86_FEATURE_KAISER) && !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) { setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); - pr_info("Filling RSB on context switch\n"); + pr_info("Spectre v2 mitigation: Filling RSB on context switch\n"); }
/* Initialize Indirect Branch Prediction Barrier if supported */ if (boot_cpu_has(X86_FEATURE_IBPB)) { setup_force_cpu_cap(X86_FEATURE_USE_IBPB); - pr_info("Enabling Indirect Branch Prediction Barrier\n"); + pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n"); } }
#undef pr_fmt
#ifdef CONFIG_SYSFS -ssize_t cpu_show_meltdown(struct device *dev, - struct device_attribute *attr, char *buf) +ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) { if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN)) return sprintf(buf, "Not affected\n"); @@ -317,16 +314,14 @@ ssize_t cpu_show_meltdown(struct device *dev, return sprintf(buf, "Vulnerable\n"); }
-ssize_t cpu_show_spectre_v1(struct device *dev, - struct device_attribute *attr, char *buf) +ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf) { if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1)) return sprintf(buf, "Not affected\n"); return sprintf(buf, "Mitigation: __user pointer sanitization\n"); }
-ssize_t cpu_show_spectre_v2(struct device *dev, - struct device_attribute *attr, char *buf) +ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf) { if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) return sprintf(buf, "Not affected\n");
From: Dan Williams dan.j.williams@intel.com
commit be3233fbfcb8f5acb6e3bcd0895c3ef9e100d470 upstream.
Allow the compiler to handle @size as an immediate value or memory directly rather than allocating a register.
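A hedged illustration of the codegen difference (instruction sequences are representative, not taken from an actual build):

	/*
	 * With the old "r" constraint the compiler must first materialize
	 * 'size' in a register:
	 *	movl	$16, %edx
	 *	cmp	%rdx, %rax
	 *	sbb	%rax, %rax
	 * With "g", an immediate or memory operand can feed CMP directly:
	 *	cmp	$16, %rax
	 *	sbb	%rax, %rax
	 */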
Reported-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Dan Williams dan.j.williams@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/151797010204.1289.1510000292250184993.stgit@dwillia... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/barrier.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h index e3a6f66..7f5dcb6 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -40,7 +40,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
asm volatile ("cmp %1,%2; sbb %0,%0;" :"=r" (mask) - :"r"(size),"r" (index) + :"g"(size),"r" (index) :"cc"); return mask; }
From: Peter Zijlstra peterz@infradead.org
commit ea00f301285ea2f07393678cd2b6057878320c9d upstream.
Joe Konno reported a compile failure resulting from using an MSR without inclusion of <asm/msr-index.h>, and while the current code builds fine (by accident) this needs fixing for future patches.
Reported-by: Joe Konno joe.konno@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: arjan@linux.intel.com Cc: bp@alien8.de Cc: dan.j.williams@intel.com Cc: dave.hansen@linux.intel.com Cc: dwmw2@infradead.org Cc: dwmw@amazon.co.uk Cc: gregkh@linuxfoundation.org Cc: hpa@zytor.com Cc: jpoimboe@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: luto@kernel.org Fixes: 20ffa1caecca ("x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support") Link: http://lkml.kernel.org/r/20180213132819.GJ25201@hirez.programming.kicks-ass.... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 8dcecb9..bca2860 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -6,6 +6,7 @@ #include <asm/alternative.h> #include <asm/alternative-asm.h> #include <asm/cpufeatures.h> +#include <asm/msr-index.h>
/* * Fill the CPU return stack buffer.
From: Juergen Gross jgross@suse.com
commit 71c208dd54ab971036d83ff6d9837bae4976e623 upstream.
Older Xen versions (4.5 and before) might have problems migrating pv guests with MSR_IA32_SPEC_CTRL having a non-zero value. So, before suspending, zero that MSR and restore it after resume.
Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Link: https://lkml.kernel.org/r/20180226140818.4849-1-jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/xen/suspend.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c index 7f664c4..4ecd0de 100644 --- a/arch/x86/xen/suspend.c +++ b/arch/x86/xen/suspend.c @@ -1,11 +1,14 @@ #include <linux/types.h> #include <linux/tick.h> +#include <linux/percpu-defs.h>
#include <xen/xen.h> #include <xen/interface/xen.h> #include <xen/grant_table.h> #include <xen/events.h>
+#include <asm/cpufeatures.h> +#include <asm/msr-index.h> #include <asm/xen/hypercall.h> #include <asm/xen/page.h> #include <asm/fixmap.h> @@ -68,6 +71,8 @@ static void xen_pv_post_suspend(int suspend_cancelled) xen_mm_unpin_all(); }
+static DEFINE_PER_CPU(u64, spec_ctrl); + void xen_arch_pre_suspend(void) { if (xen_pv_domain()) @@ -84,6 +89,9 @@ void xen_arch_post_suspend(int cancelled)
static void xen_vcpu_notify_restore(void *data) { + if (xen_pv_domain() && boot_cpu_has(X86_FEATURE_SPEC_CTRL)) + wrmsrl(MSR_IA32_SPEC_CTRL, this_cpu_read(spec_ctrl)); + /* Boot processor notified via generic timekeeping_resume() */ if (smp_processor_id() == 0) return; @@ -93,7 +101,15 @@ static void xen_vcpu_notify_restore(void *data)
static void xen_vcpu_notify_suspend(void *data) { + u64 tmp; + tick_suspend_local(); + + if (xen_pv_domain() && boot_cpu_has(X86_FEATURE_SPEC_CTRL)) { + rdmsrl(MSR_IA32_SPEC_CTRL, tmp); + this_cpu_write(spec_ctrl, tmp); + wrmsrl(MSR_IA32_SPEC_CTRL, 0); + } }
void xen_arch_resume(void)
From: Dave Hansen dave.hansen@linux.intel.com
commit 39a0526fb3f7d93433d146304278477eb463f8af upstream
The arch-specific mm_context_t is a great place to put protection-key allocation state.
But, we need to initialize the allocation state because pkey 0 is always "allocated". All of the runtime initialization of mm_context_t is done in *_ldt() manipulation functions. This renames the existing LDT functions like this:
init_new_context() -> init_new_context_ldt() destroy_context() -> destroy_context_ldt()
and makes init_new_context() and destroy_context() available for generic use.
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: Andrew Morton akpm@linux-foundation.org Cc: Andy Lutomirski luto@amacapital.net Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Dave Hansen dave@sr71.net Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210234.DB34FCC5@viggo.jf.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/mmu_context.h | 21 ++++++++++++++++----- arch/x86/kernel/ldt.c | 4 ++-- 2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 9bfc5fd..1c4794f 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -52,15 +52,15 @@ struct ldt_struct { /* * Used for LDT copy/destruction. */ -int init_new_context(struct task_struct *tsk, struct mm_struct *mm); -void destroy_context(struct mm_struct *mm); +int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm); +void destroy_context_ldt(struct mm_struct *mm); #else /* CONFIG_MODIFY_LDT_SYSCALL */ -static inline int init_new_context(struct task_struct *tsk, - struct mm_struct *mm) +static inline int init_new_context_ldt(struct task_struct *tsk, + struct mm_struct *mm) { return 0; } -static inline void destroy_context(struct mm_struct *mm) {} +static inline void destroy_context_ldt(struct mm_struct *mm) {} #endif
static inline void load_mm_ldt(struct mm_struct *mm) @@ -102,6 +102,17 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) this_cpu_write(cpu_tlbstate.state, TLBSTATE_LAZY); }
+static inline int init_new_context(struct task_struct *tsk, + struct mm_struct *mm) +{ + init_new_context_ldt(tsk, mm); + return 0; +} +static inline void destroy_context(struct mm_struct *mm) +{ + destroy_context_ldt(mm); +} + extern void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk);
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c index bc42936..8bc68cf 100644 --- a/arch/x86/kernel/ldt.c +++ b/arch/x86/kernel/ldt.c @@ -119,7 +119,7 @@ static void free_ldt_struct(struct ldt_struct *ldt) * we do not have to muck with descriptors here, that is * done in switch_mm() as needed. */ -int init_new_context(struct task_struct *tsk, struct mm_struct *mm) +int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm) { struct ldt_struct *new_ldt; struct mm_struct *old_mm; @@ -160,7 +160,7 @@ out_unlock: * * 64bit: Don't touch the LDT register - we're already in the next thread. */ -void destroy_context(struct mm_struct *mm) +void destroy_context_ldt(struct mm_struct *mm) { free_ldt_struct(mm->context.ldt); mm->context.ldt = NULL;
From: Andy Lutomirski luto@kernel.org
commit f39681ed0f48498b80455095376f11535feea332 upstream.
This adds two new variables to mmu_context_t: ctx_id and tlb_gen. ctx_id uniquely identifies the mm_struct and will never be reused. For a given mm_struct (and hence ctx_id), tlb_gen is a monotonic count of the number of times that a TLB flush has been requested. The pair (ctx_id, tlb_gen) can be used as an identifier for TLB flush actions and will be used in subsequent patches to reliably determine whether all needed TLB flushes have occurred on a given CPU.
This patch is split out for ease of review. By itself, it has no real effect other than creating and updating the new variables.
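A hedged sketch of the property this buys (helper name hypothetical):

	/*
	 * Pointer equality can lie: a freed mm_struct can be recycled for
	 * a new mm at the same address. ctx_id never repeats, so comparing
	 * it is reliable where comparing mm pointers is not.
	 */
	static bool example_same_mm(u64 seen_ctx_id, struct mm_struct *mm)
	{
		return seen_ctx_id == mm->context.ctx_id;
	}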
Signed-off-by: Andy Lutomirski luto@kernel.org Reviewed-by: Nadav Amit nadav.amit@gmail.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: Andrew Morton akpm@linux-foundation.org Cc: Arjan van de Ven arjan@linux.intel.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mel Gorman mgorman@suse.de Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/413a91c24dab3ed0caa5f4e4d017d87b0857f920.1498751203... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Tim Chen tim.c.chen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/mmu.h | 15 +++++++++++++-- arch/x86/include/asm/mmu_context.h | 4 ++++ arch/x86/mm/tlb.c | 2 ++ 3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 7680b76..3359dfe 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -3,12 +3,18 @@
#include <linux/spinlock.h> #include <linux/mutex.h> +#include <linux/atomic.h>
/* - * The x86 doesn't have a mmu context, but - * we put the segment information here. + * x86 has arch-specific MMU state beyond what lives in mm_struct. */ typedef struct { + /* + * ctx_id uniquely identifies this mm_struct. A ctx_id will never + * be reused, and zero is not a valid ctx_id. + */ + u64 ctx_id; + #ifdef CONFIG_MODIFY_LDT_SYSCALL struct ldt_struct *ldt; #endif @@ -24,6 +30,11 @@ typedef struct { atomic_t perf_rdpmc_allowed; /* nonzero if rdpmc is allowed */ } mm_context_t;
+#define INIT_MM_CONTEXT(mm) \ + .context = { \ + .ctx_id = 1, \ + } + void leave_mm(int cpu);
#endif /* _ASM_X86_MMU_H */ diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 1c4794f..effc127 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -11,6 +11,9 @@ #include <asm/tlbflush.h> #include <asm/paravirt.h> #include <asm/mpx.h> + +extern atomic64_t last_mm_ctx_id; + #ifndef CONFIG_PARAVIRT static inline void paravirt_activate_mm(struct mm_struct *prev, struct mm_struct *next) @@ -105,6 +108,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) static inline int init_new_context(struct task_struct *tsk, struct mm_struct *mm) { + mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id); init_new_context_ldt(tsk, mm); return 0; } diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 7cad01af..efec198 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -29,6 +29,8 @@ * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi */
+atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1); + struct flush_tlb_info { struct mm_struct *flush_mm; unsigned long flush_start;
From: Tim Chen tim.c.chen@linux.intel.com
commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7 upstream.
Flush indirect branches when switching into a process that marked itself non-dumpable. This protects high-value processes like gpg better, without incurring too high a performance overhead.
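For context, a process opts in to this protection simply by making itself non-dumpable, as secret-handling tools like gpg already do; a minimal userspace sketch:

	#include <sys/prctl.h>

	/*
	 * After this call, the kernel will issue an IBPB when switching
	 * into this process from a different mm.
	 */
	prctl(PR_SET_DUMPABLE, 0);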
If done naïvely, we could switch to a kernel idle thread and then back to the original process, such as:
process A -> idle -> process A
In such a scenario, we do not have to do IBPB here even though the process is non-dumpable, as we are switching back to the same process after a hiatus.
To avoid the redundant IBPB, which is expensive, we track the last mm user context ID. The cost is to have an extra u64 mm context id to track the last mm we were using before switching to the init_mm used by idle. Avoiding the extra IBPB is probably worth the extra memory for this common scenario.
For those cases where tlb_defer_switch_to_init_mm() returns true (non-PCID), lazy tlb will defer the switch to init_mm, so we will not be changing the mm for the process A -> idle -> process A switch. So IBPB will be skipped for this case.
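A hedged walk-through of the A -> idle -> A case under this scheme (ctx_id values made up):

	process A runs		last_ctx_id = 5 (A's ctx_id)
	A -> idle		idle is a kernel thread (tsk->mm == NULL), so no
				IBPB; init_mm is deliberately never recorded,
				so last_ctx_id stays 5
	idle -> A		A's ctx_id (5) == last_ctx_id (5) -> IBPB skipped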
Thanks to the reviewers and Andy Lutomirski for the suggestion of using ctx_id which got rid of the problem of mm pointer recycling.
Signed-off-by: Tim Chen tim.c.chen@linux.intel.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: ak@linux.intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: linux@dominikbrodowski.net Cc: peterz@infradead.org Cc: bp@alien8.de Cc: luto@kernel.org Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/tlbflush.h | 2 ++ arch/x86/mm/tlb.c | 31 +++++++++++++++++++++++++++++++ 2 files changed, 33 insertions(+)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index e2a89d2..8ce07db 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -68,6 +68,8 @@ static inline void invpcid_flush_all_nonglobals(void) struct tlb_state { struct mm_struct *active_mm; int state; + /* last user mm's ctx id */ + u64 last_ctx_id;
/* * Access to this CR4 shadow and to H/W CR4 is protected by diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index efec198..6d683bb 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -10,6 +10,7 @@
#include <asm/tlbflush.h> #include <asm/mmu_context.h> +#include <asm/nospec-branch.h> #include <asm/cache.h> #include <asm/apic.h> #include <asm/uv/uv.h> @@ -106,6 +107,36 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, unsigned cpu = smp_processor_id();
if (likely(prev != next)) { + u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id); + + /* + * Avoid user/user BTB poisoning by flushing the branch + * predictor when switching between processes. This stops + * one process from doing Spectre-v2 attacks on another. + * + * As an optimization, flush indirect branches only when + * switching into processes that disable dumping. This + * protects high value processes like gpg, without having + * too high performance overhead. IBPB is *expensive*! + * + * This will not flush branches when switching into kernel + * threads. It will also not flush if we switch to idle + * thread and back to the same process. It will flush if we + * switch to a different non-dumpable process. + */ + if (tsk && tsk->mm && + tsk->mm->context.ctx_id != last_ctx_id && + get_dumpable(tsk->mm) != SUID_DUMP_USER) + indirect_branch_prediction_barrier(); + + /* + * Record last user mm's context id, so we can avoid + * flushing branch buffer with IBPB if we switch back + * to the same user. + */ + if (next != &init_mm) + this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id); + this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK); this_cpu_write(cpu_tlbstate.active_mm, next); cpumask_set_cpu(cpu, mm_cpumask(next));
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 36268223c1e9981d6cfc33aff8520b3bde4b8114 upstream.
As:
1) It's known that hypervisors lie about the environment anyhow (host mismatch)
2) Even if the hypervisor (Xen, KVM, VMWare, etc) provided a valid "correct" value, it all gets to be very murky when migration happens (do you provide the "new" microcode of the machine?).
And in reality the cloud vendors are the ones that should make sure that the microcode that is running is correct and we should just sing lalalala and trust them.
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Paolo Bonzini pbonzini@redhat.com Cc: Wanpeng Li kernellwp@gmail.com Cc: kvm kvm@vger.kernel.org Cc: Krčmář rkrcmar@redhat.com Cc: Borislav Petkov bp@alien8.de CC: "H. Peter Anvin" hpa@zytor.com CC: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180226213019.GE9497@char.us.oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/intel.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index b69d258..dcc0349 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -68,6 +68,13 @@ static bool bad_spectre_microcode(struct cpuinfo_x86 *c) { int i;
+ /* + * We know that the hypervisor lie to us on the microcode version so + * we may as well hope that it is running the correct version. + */ + if (cpu_has(c, X86_FEATURE_HYPERVISOR)) + return false; + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { if (c->x86_model == spectre_bad_microcodes[i].model && c->x86_mask == spectre_bad_microcodes[i].stepping)
From: David Woodhouse dwmw@amazon.co.uk
commit dd84441a797150dcc49298ec95c459a8891d8bb1 upstream.
Retpoline means the kernel is safe because it has no indirect branches. But firmware isn't, so use IBRS for firmware calls if it's available.
Block preemption while IBRS is set, although in practice the call sites already had to be doing that.
Ignore hpwdt.c for now. It's taking spinlocks and calling into firmware code, from an NMI handler. I don't want to touch that with a bargepole.
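The resulting pattern around any firmware entry point looks like this (sketch; the EFI service shown is illustrative):

	firmware_restrict_branch_speculation_start();	/* preempt_disable() + set SPEC_CTRL.IBRS */
	status = efi.systab->runtime->get_time(tm, tc);	/* firmware runs with prediction restricted */
	firmware_restrict_branch_speculation_end();	/* clear IBRS, preempt_enable() */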
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Srivatsa: Backported to 4.4.y, patching the efi_call_virt() family of functions, which are the 4.4.y-equivalents of arch_efi_call_virt_setup()/teardown() ] Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/apm.h | 6 +++++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/efi.h | 7 ++++++ arch/x86/include/asm/nospec-branch.h | 39 ++++++++++++++++++++++++++-------- arch/x86/kernel/cpu/bugs.c | 12 ++++++++++ arch/x86/platform/efi/efi_64.c | 3 +++ 6 files changed, 58 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/apm.h b/arch/x86/include/asm/apm.h index 20370c6..3d1ec41 100644 --- a/arch/x86/include/asm/apm.h +++ b/arch/x86/include/asm/apm.h @@ -6,6 +6,8 @@ #ifndef _ASM_X86_MACH_DEFAULT_APM_H #define _ASM_X86_MACH_DEFAULT_APM_H
+#include <asm/nospec-branch.h> + #ifdef APM_ZERO_SEGS # define APM_DO_ZERO_SEGS \ "pushl %%ds\n\t" \ @@ -31,6 +33,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in, * N.B. We do NOT need a cld after the BIOS call * because we always save and restore the flags. */ + firmware_restrict_branch_speculation_start(); __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" @@ -43,6 +46,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in, "=S" (*esi) : "a" (func), "b" (ebx_in), "c" (ecx_in) : "memory", "cc"); + firmware_restrict_branch_speculation_end(); }
static inline u8 apm_bios_call_simple_asm(u32 func, u32 ebx_in, @@ -55,6 +59,7 @@ static inline u8 apm_bios_call_simple_asm(u32 func, u32 ebx_in, * N.B. We do NOT need a cld after the BIOS call * because we always save and restore the flags. */ + firmware_restrict_branch_speculation_start(); __asm__ __volatile__(APM_DO_ZERO_SEGS "pushl %%edi\n\t" "pushl %%ebp\n\t" @@ -67,6 +72,7 @@ static inline u8 apm_bios_call_simple_asm(u32 func, u32 ebx_in, "=S" (si) : "a" (func), "b" (ebx_in), "c" (ecx_in) : "memory", "cc"); + firmware_restrict_branch_speculation_end(); return error; }
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 782005d..bc76bf3 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -202,6 +202,7 @@ #define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ +#define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h index 0010c78..7e5a2ff 100644 --- a/arch/x86/include/asm/efi.h +++ b/arch/x86/include/asm/efi.h @@ -3,6 +3,7 @@
#include <asm/fpu/api.h> #include <asm/pgtable.h> +#include <asm/nospec-branch.h>
/* * We map the EFI regions needed for runtime services non-contiguously, @@ -39,8 +40,10 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...); ({ \ efi_status_t __s; \ kernel_fpu_begin(); \ + firmware_restrict_branch_speculation_start(); \ __s = ((efi_##f##_t __attribute__((regparm(0)))*) \ efi.systab->runtime->f)(args); \ + firmware_restrict_branch_speculation_end(); \ kernel_fpu_end(); \ __s; \ }) @@ -49,8 +52,10 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...); #define __efi_call_virt(f, args...) \ ({ \ kernel_fpu_begin(); \ + firmware_restrict_branch_speculation_start(); \ ((efi_##f##_t __attribute__((regparm(0)))*) \ efi.systab->runtime->f)(args); \ + firmware_restrict_branch_speculation_end(); \ kernel_fpu_end(); \ })
@@ -71,7 +76,9 @@ extern u64 asmlinkage efi_call(void *fp, ...); efi_sync_low_kernel_mappings(); \ preempt_disable(); \ __kernel_fpu_begin(); \ + firmware_restrict_branch_speculation_start(); \ __s = efi_call((void *)efi.systab->runtime->f, __VA_ARGS__); \ + firmware_restrict_branch_speculation_end(); \ __kernel_fpu_end(); \ preempt_enable(); \ __s; \ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index bca2860..36ded24 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -195,17 +195,38 @@ static inline void vmexit_fill_RSB(void) #endif }
+#define alternative_msr_write(_msr, _val, _feature) \ + asm volatile(ALTERNATIVE("", \ + "movl %[msr], %%ecx\n\t" \ + "movl %[val], %%eax\n\t" \ + "movl $0, %%edx\n\t" \ + "wrmsr", \ + _feature) \ + : : [msr] "i" (_msr), [val] "i" (_val) \ + : "eax", "ecx", "edx", "memory") + static inline void indirect_branch_prediction_barrier(void) { - asm volatile(ALTERNATIVE("", - "movl %[msr], %%ecx\n\t" - "movl %[val], %%eax\n\t" - "movl $0, %%edx\n\t" - "wrmsr", - X86_FEATURE_USE_IBPB) - : : [msr] "i" (MSR_IA32_PRED_CMD), - [val] "i" (PRED_CMD_IBPB) - : "eax", "ecx", "edx", "memory"); + alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, + X86_FEATURE_USE_IBPB); +} + +/* + * With retpoline, we must use IBRS to restrict branch prediction + * before calling into firmware. + */ +static inline void firmware_restrict_branch_speculation_start(void) +{ + preempt_disable(); + alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS, + X86_FEATURE_USE_IBRS_FW); +} + +static inline void firmware_restrict_branch_speculation_end(void) +{ + alternative_msr_write(MSR_IA32_SPEC_CTRL, 0, + X86_FEATURE_USE_IBRS_FW); + preempt_enable(); }
#endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index fea368d..b294fdc 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -300,6 +300,15 @@ retpoline_auto: setup_force_cpu_cap(X86_FEATURE_USE_IBPB); pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n"); } + + /* + * Retpoline means the kernel is safe because it has no indirect + * branches. But firmware isn't, so use IBRS to protect that. + */ + if (boot_cpu_has(X86_FEATURE_IBRS)) { + setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW); + pr_info("Enabling Restricted Speculation for firmware calls\n"); + } }
#undef pr_fmt @@ -326,8 +335,9 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) return sprintf(buf, "Not affected\n");
- return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], + return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", spectre_v2_module_string()); } #endif diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index a0ac0f9..f5a8cd9 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -40,6 +40,7 @@ #include <asm/fixmap.h> #include <asm/realmode.h> #include <asm/time.h> +#include <asm/nospec-branch.h>
/* * We allocate runtime services regions bottom-up, starting from -4G, i.e. @@ -347,6 +348,7 @@ extern efi_status_t efi64_thunk(u32, ...); \ efi_sync_low_kernel_mappings(); \ local_irq_save(flags); \ + firmware_restrict_branch_speculation_start(); \ \ efi_scratch.prev_cr3 = read_cr3(); \ write_cr3((unsigned long)efi_scratch.efi_pgt); \ @@ -357,6 +359,7 @@ extern efi_status_t efi64_thunk(u32, ...); \ write_cr3(efi_scratch.prev_cr3); \ __flush_tlb_all(); \ + firmware_restrict_branch_speculation_end(); \ local_irq_restore(flags); \ \ __s; \
From: Ingo Molnar mingo@kernel.org
commit d72f4e29e6d84b7ec02ae93088aa459ac70e733b upstream.
firmware_restrict_branch_speculation_*() recently started using preempt_enable()/disable(), but those are relatively high level primitives and cause build failures on some 32-bit builds.
Since we want to keep <asm/nospec-branch.h> low level, convert them to macros to avoid header hell...
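The conversion relies on the usual do { } while (0) idiom, which lets a multi-statement macro behave like a single C statement (a generic sketch, not the actual kernel code; both statements are hypothetical):

    /* Without the do/while wrapper, "if (x) DO_TWO_THINGS(); else ..."
     * would silently leave the second statement outside the if. */
    #define DO_TWO_THINGS()                 \
    do {                                    \
            first_statement();              \
            second_statement();             \
    } while (0)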
Cc: David Woodhouse dwmw@amazon.co.uk Cc: Thomas Gleixner tglx@linutronix.de Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 36ded24..b9dd1d9 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -214,20 +214,22 @@ static inline void indirect_branch_prediction_barrier(void) /* * With retpoline, we must use IBRS to restrict branch prediction * before calling into firmware. + * + * (Implemented as CPP macros due to header hell.) */ -static inline void firmware_restrict_branch_speculation_start(void) -{ - preempt_disable(); - alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS, - X86_FEATURE_USE_IBRS_FW); -} +#define firmware_restrict_branch_speculation_start() \ +do { \ + preempt_disable(); \ + alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS, \ + X86_FEATURE_USE_IBRS_FW); \ +} while (0)
-static inline void firmware_restrict_branch_speculation_end(void) -{ - alternative_msr_write(MSR_IA32_SPEC_CTRL, 0, - X86_FEATURE_USE_IBRS_FW); - preempt_enable(); -} +#define firmware_restrict_branch_speculation_end() \ +do { \ + alternative_msr_write(MSR_IA32_SPEC_CTRL, 0, \ + X86_FEATURE_USE_IBRS_FW); \ + preempt_enable(); \ +} while (0)
#endif /* __ASSEMBLY__ */
From: Alexander Sergeyev sergeev917@gmail.com
commit e3b3121fa8da94cb20f9e0c64ab7981ae47fd085 upstream.
In accordance with Intel's microcode revision guidance from March 6, MCU rev 0xc2 is cleared on both Skylake H/S and Skylake Xeon E3 processors that share CPUID 506E3.
Signed-off-by: Alexander Sergeyev sergeev917@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Jia Zhang qianyue.zj@alibaba-inc.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Kyle Huey me@kylehuey.com Cc: David Woodhouse dwmw@amazon.co.uk Link: https://lkml.kernel.org/r/20180313193856.GA8580@localhost.localdomain Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/intel.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index dcc0349..77d9f68 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -29,7 +29,7 @@ /* * Early microcode releases for the Spectre v2 mitigation were broken. * Information taken from; - * - https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/microcode-upd... + * - https://newsroom.intel.com/wp-content/uploads/sites/11/2018/03/microcode-upd... * - https://kb.vmware.com/s/article/52345 * - Microcode revisions observed in the wild * - Release note from 20180108 microcode release @@ -47,7 +47,6 @@ static const struct sku_microcode spectre_bad_microcodes[] = { { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x80 }, { INTEL_FAM6_SKYLAKE_X, 0x03, 0x0100013e }, { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003c }, - { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0xc2 }, { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x1b }, { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 },
From: Mickaël Salaün mic@digikod.net
commit 6c045d07bb305c527140bdec4cf8ab50f7c980d8 upstream
Rename SECCOMP_FLAG_FILTER_TSYNC to SECCOMP_FILTER_FLAG_TSYNC to match the UAPI.
Signed-off-by: Mickaël Salaün mic@digikod.net Cc: Andy Lutomirski luto@amacapital.net Cc: Kees Cook keescook@chromium.org Cc: Shuah Khan shuahkh@osg.samsung.com Cc: Will Drewry wad@chromium.org Acked-by: Kees Cook keescook@chromium.org Signed-off-by: Shuah Khan shuahkh@osg.samsung.com Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
tools/testing/selftests/seccomp/seccomp_bpf.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 882fe83..d446346 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -1476,8 +1476,8 @@ TEST_F(TRACE_syscall, syscall_dropped) #define SECCOMP_SET_MODE_FILTER 1 #endif
-#ifndef SECCOMP_FLAG_FILTER_TSYNC -#define SECCOMP_FLAG_FILTER_TSYNC 1 +#ifndef SECCOMP_FILTER_FLAG_TSYNC +#define SECCOMP_FILTER_FLAG_TSYNC 1 #endif
#ifndef seccomp @@ -1592,7 +1592,7 @@ TEST(TSYNC_first) TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!"); }
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &prog); ASSERT_NE(ENOSYS, errno) { TH_LOG("Kernel does not support seccomp syscall!"); @@ -1810,7 +1810,7 @@ TEST_F(TSYNC, two_siblings_with_ancestor) self->sibling_count++; }
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_EQ(0, ret) { TH_LOG("Could install filter on all threads!"); @@ -1871,7 +1871,7 @@ TEST_F(TSYNC, two_siblings_with_no_filter) TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!"); }
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_NE(ENOSYS, errno) { TH_LOG("Kernel does not support seccomp syscall!"); @@ -1919,7 +1919,7 @@ TEST_F(TSYNC, two_siblings_with_one_divergence) self->sibling_count++; }
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_EQ(self->sibling[0].system_tid, ret) { TH_LOG("Did not fail on diverged sibling."); @@ -1971,7 +1971,7 @@ TEST_F(TSYNC, two_siblings_not_under_filter) TH_LOG("Kernel does not support SECCOMP_SET_MODE_FILTER!"); }
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_EQ(ret, self->sibling[0].system_tid) { TH_LOG("Did not fail on diverged sibling."); @@ -2000,7 +2000,7 @@ TEST_F(TSYNC, two_siblings_not_under_filter) /* Switch to the remaining sibling */ sib = !sib;
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_EQ(0, ret) { TH_LOG("Expected the remaining sibling to sync"); @@ -2023,7 +2023,7 @@ TEST_F(TSYNC, two_siblings_not_under_filter) while (!kill(self->sibling[sib].system_tid, 0)) sleep(0.1);
- ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FLAG_FILTER_TSYNC, + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, &self->apply_prog); ASSERT_EQ(0, ret); /* just us chickens */ }
From: Mickaël Salaün mic@digikod.net
commit 505ce68c6da3432454c62e43c24a22ea5b1d754b upstream
Signed-off-by: Mickaël Salaün mic@digikod.net Cc: Andy Lutomirski luto@amacapital.net Cc: Kees Cook keescook@chromium.org Cc: Shuah Khan shuahkh@osg.samsung.com Cc: Will Drewry wad@chromium.org Acked-by: Kees Cook keescook@chromium.org Signed-off-by: Shuah Khan shuahkh@osg.samsung.com Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
tools/testing/selftests/seccomp/seccomp_bpf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index d446346..29487e0 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -1481,10 +1481,10 @@ TEST_F(TRACE_syscall, syscall_dropped) #endif
#ifndef seccomp -int seccomp(unsigned int op, unsigned int flags, struct sock_fprog *filter) +int seccomp(unsigned int op, unsigned int flags, void *args) { errno = 0; - return syscall(__NR_seccomp, op, flags, filter); + return syscall(__NR_seccomp, op, flags, args); } #endif
From: Juergen Gross jgross@suse.com
Upstream commit: 0808e80cb760de2733c0527d2090ed2205a1eef8 ("xen: set cpu capabilities from xen_start_kernel()")
There is no need to set the same capabilities for each cpu individually. This can easily be done for all cpus when starting the kernel.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/xen/enlighten.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index cbef64b..2d7ab4e 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -460,6 +460,14 @@ static void __init xen_init_cpuid_mask(void) cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32)); }
+static void __init xen_init_capabilities(void) +{ + if (xen_pv_domain()) { + setup_clear_cpu_cap(X86_BUG_SYSRET_SS_ATTRS); + setup_force_cpu_cap(X86_FEATURE_XENPV); + } +} + static void xen_set_debugreg(int reg, unsigned long val) { HYPERVISOR_set_debugreg(reg, val); @@ -1587,6 +1595,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
xen_init_irq_ops(); xen_init_cpuid_mask(); + xen_init_capabilities();
#ifdef CONFIG_X86_LOCAL_APIC /* @@ -1883,14 +1892,6 @@ bool xen_hvm_need_lapic(void) } EXPORT_SYMBOL_GPL(xen_hvm_need_lapic);
-static void xen_set_cpu_features(struct cpuinfo_x86 *c) -{ - if (xen_pv_domain()) { - clear_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); - set_cpu_cap(c, X86_FEATURE_XENPV); - } -} - const struct hypervisor_x86 x86_hyper_xen = { .name = "Xen", .detect = xen_platform, @@ -1898,7 +1899,6 @@ const struct hypervisor_x86 x86_hyper_xen = { .init_platform = xen_hvm_guest_init, #endif .x2apic_available = xen_x2apic_para_available, - .set_cpu_features = xen_set_cpu_features, }; EXPORT_SYMBOL(x86_hyper_xen);
From: David Woodhouse dwmw@amazon.co.uk
commit def9331a12977770cc6132d79f8e6565871e8e38 upstream
When running as a Xen pv guest, X86_BUG_SYSRET_SS_ATTRS must not be set on AMD CPUs.
This bug/feature bit is kind of special as it will be used very early when switching threads. Setting the bit and clearing it a little bit later leaves a critical window where things can go wrong. This time window has been enlarged a little bit by using setup_clear_cpu_cap() instead of the hypervisor's set_cpu_features callback. It seems this larger window now makes it rather easy to hit the problem.
The proper solution is to never set the bit in case of Xen.
Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Acked-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/amd.c | 5 +++-- arch/x86/xen/enlighten.c | 4 +--- 2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index f4fb8f5..9b29414 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -791,8 +791,9 @@ static void init_amd(struct cpuinfo_x86 *c) if (cpu_has(c, X86_FEATURE_3DNOW) || cpu_has(c, X86_FEATURE_LM)) set_cpu_cap(c, X86_FEATURE_3DNOWPREFETCH);
- /* AMD CPUs don't reset SS attributes on SYSRET */ - set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); + /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */ + if (!cpu_has(c, X86_FEATURE_XENPV)) + set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); }
#ifdef CONFIG_X86_32 diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 2d7ab4e..82fd84d 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -462,10 +462,8 @@ static void __init xen_init_cpuid_mask(void)
static void __init xen_init_capabilities(void) { - if (xen_pv_domain()) { - setup_clear_cpu_cap(X86_BUG_SYSRET_SS_ATTRS); + if (xen_pv_domain()) setup_force_cpu_cap(X86_FEATURE_XENPV); - } }
static void xen_set_debugreg(int reg, unsigned long val)
From: Linus Torvalds torvalds@linux-foundation.org
commit 1aa7a5735a41418d8e01fa7c9565eb2657e2ea3f upstream
The macro is not type safe and I did look for why that "g" constraint for the asm doesn't work: it's because the asm is more fundamentally wrong.
It does
movl %[val], %%eax
but "val" isn't a 32-bit value, so then gcc will pass it in a register, and generate code like
movl %rsi, %eax
and gas will complain about a nonsensical 'mov' instruction (it's moving a 64-bit register to a 32-bit one).
Passing it through memory will just hide the real bug - gcc still thinks the memory location is 64-bit, but the "movl" will only load the first 32 bits and it all happens to work because x86 is little-endian.
Convert it to a type safe inline function with a little trick which hands the feature into the ALTERNATIVE macro.
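The essence of that trick, minus the ALTERNATIVE patching, is the use of register constraints to split the 64-bit value into the EDX:EAX pair that WRMSR expects (an illustrative sketch assuming kernel types from <linux/types.h>, not the exact code from the diff below):

    static inline void wrmsr_sketch(unsigned int msr, u64 val)
    {
            /* "c"/"a"/"d" pin msr and the two 32-bit halves of val
             * into ECX, EAX and EDX; gcc now performs the width
             * conversions instead of the asm emitting a bogus movl
             * on a 64-bit operand. */
            asm volatile("wrmsr"
                         : : "c" (msr),
                             "a" ((u32)val),
                             "d" ((u32)(val >> 32))
                         : "memory");
    }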
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index b9dd1d9..6403016 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -195,15 +195,16 @@ static inline void vmexit_fill_RSB(void) #endif }
-#define alternative_msr_write(_msr, _val, _feature) \ - asm volatile(ALTERNATIVE("", \ - "movl %[msr], %%ecx\n\t" \ - "movl %[val], %%eax\n\t" \ - "movl $0, %%edx\n\t" \ - "wrmsr", \ - _feature) \ - : : [msr] "i" (_msr), [val] "i" (_val) \ - : "eax", "ecx", "edx", "memory") +static __always_inline +void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature) +{ + asm volatile(ALTERNATIVE("", "wrmsr", %c[feature]) + : : "c" (msr), + "a" (val), + "d" (val >> 32), + [feature] "i" (feature) + : "memory"); +}
static inline void indirect_branch_prediction_barrier(void) {
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 4a28bfe3267b68e22c663ac26185aa16c9b879ef upstream
Combine the various pieces of logic that go through all those x86_cpu_id matching structures into one function.
Suggested-by: Borislav Petkov bp@suse.de Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/common.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 48499b4..97558d1 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -835,21 +835,27 @@ static const __initconst struct x86_cpu_id cpu_no_meltdown[] = { {} };
-static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c) +static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = 0;
+ if (x86_match_cpu(cpu_no_speculation)) + return; + + setup_force_cpu_bug(X86_BUG_SPECTRE_V1); + setup_force_cpu_bug(X86_BUG_SPECTRE_V2); + if (x86_match_cpu(cpu_no_meltdown)) - return false; + return;
if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
/* Rogue Data Cache Load? No! */ if (ia32_cap & ARCH_CAP_RDCL_NO) - return false; + return;
- return true; + setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); }
/* @@ -898,12 +904,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
- if (!x86_match_cpu(cpu_no_speculation)) { - if (cpu_vulnerable_to_meltdown(c)) - setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); - setup_force_cpu_bug(X86_BUG_SPECTRE_V1); - setup_force_cpu_bug(X86_BUG_SPECTRE_V2); - } + cpu_set_bug_bits(c);
fpu__init_system(c);
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit d1059518b4789cabe34bb4b714d07e6089c82ca1 upstream
Those sysfs functions have a similar preamble; factor out common code to handle them.
Suggested-by: Borislav Petkov bp@suse.de Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 46 +++++++++++++++++++++++++++++++------------- 1 file changed, 32 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index b294fdc..75f3d49 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -314,30 +314,48 @@ retpoline_auto: #undef pr_fmt
#ifdef CONFIG_SYSFS -ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) + +ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, + char *buf, unsigned int bug) { - if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN)) + if (!boot_cpu_has_bug(bug)) return sprintf(buf, "Not affected\n"); - if (boot_cpu_has(X86_FEATURE_KAISER)) - return sprintf(buf, "Mitigation: PTI\n"); + + switch (bug) { + case X86_BUG_CPU_MELTDOWN: + if (boot_cpu_has(X86_FEATURE_KAISER)) + return sprintf(buf, "Mitigation: PTI\n"); + + break; + + case X86_BUG_SPECTRE_V1: + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); + + case X86_BUG_SPECTRE_V2: + return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], + boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", + spectre_v2_module_string()); + + default: + break; + } + return sprintf(buf, "Vulnerable\n"); }
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_CPU_MELTDOWN); +} + ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf) { - if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1)) - return sprintf(buf, "Not affected\n"); - return sprintf(buf, "Mitigation: __user pointer sanitization\n"); + return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V1); }
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf) { - if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) - return sprintf(buf, "Not affected\n"); - - return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], - boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", - boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", - spectre_v2_module_string()); + return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V2); } #endif
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 1b86883ccb8d5d9506529d42dbe1a5257cb30b18 upstream
The 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to all the other bits as reserved. The Intel SDM glossary defines reserved as implementation specific - aka unknown.
As such, this must be taken into account at bootup, and proper masking must be applied for the bits in use.
A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199511
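The resulting discipline, condensed (a restatement of the pattern the diff below introduces, with hypothetical helper names): read the MSR once at boot, keep that value as the base, and OR the mitigation bits on top of it for every later write.

    static u64 x86_spec_ctrl_base;  /* boot-time value, reserved bits included */

    static void __init capture_reserved_bits(void)  /* hypothetical init hook */
    {
            if (boot_cpu_has(X86_FEATURE_IBRS))
                    rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
    }

    static void set_mitigation_bits(u64 val)        /* hypothetical setter */
    {
            /* Reserved bits ride along untouched in x86_spec_ctrl_base. */
            wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);
    }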
[ tglx: Made x86_spec_ctrl_base __ro_after_init ] [ Srivatsa: Removed __ro_after_init for 4.4.y ]
Suggested-by: Jon Masters jcm@redhat.com Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 24 ++++++++++++++++++++---- arch/x86/kernel/cpu/bugs.c | 27 +++++++++++++++++++++++++++ 2 files changed, 47 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 6403016..daec318 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -172,6 +172,17 @@ enum spectre_v2_mitigation { SPECTRE_V2_IBRS, };
+/* + * The Intel specification for the SPEC_CTRL MSR requires that we + * preserve any already set reserved bits at boot time (e.g. for + * future additions that this kernel is not currently aware of). + * We then set any additional mitigation bits that we want + * ourselves and always use this as the base for SPEC_CTRL. + * We also use this when handling guest entry/exit as below. + */ +extern void x86_spec_ctrl_set(u64); +extern u64 x86_spec_ctrl_get_default(void); + extern char __indirect_thunk_start[]; extern char __indirect_thunk_end[];
@@ -208,8 +219,9 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
static inline void indirect_branch_prediction_barrier(void) { - alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, - X86_FEATURE_USE_IBPB); + u64 val = PRED_CMD_IBPB; + + alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB); }
/* @@ -220,14 +232,18 @@ static inline void indirect_branch_prediction_barrier(void) */ #define firmware_restrict_branch_speculation_start() \ do { \ + u64 val = x86_spec_ctrl_get_default() | SPEC_CTRL_IBRS; \ + \ preempt_disable(); \ - alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS, \ + alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \ X86_FEATURE_USE_IBRS_FW); \ } while (0)
#define firmware_restrict_branch_speculation_end() \ do { \ - alternative_msr_write(MSR_IA32_SPEC_CTRL, 0, \ + u64 val = x86_spec_ctrl_get_default(); \ + \ + alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \ X86_FEATURE_USE_IBRS_FW); \ preempt_enable(); \ } while (0) diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 75f3d49..42c2204 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -27,6 +27,12 @@
static void __init spectre_v2_select_mitigation(void);
+/* + * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any + * writes to SPEC_CTRL contain whatever reserved bits have been set. + */ +static u64 x86_spec_ctrl_base; + void __init check_bugs(void) { identify_boot_cpu(); @@ -36,6 +42,13 @@ void __init check_bugs(void) print_cpu_info(&boot_cpu_data); }
+ /* + * Read the SPEC_CTRL MSR to account for reserved bits which may + * have unknown values. + */ + if (boot_cpu_has(X86_FEATURE_IBRS)) + rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); + /* Select the proper spectre mitigation before patching alternatives */ spectre_v2_select_mitigation();
@@ -94,6 +107,20 @@ static const char *spectre_v2_strings[] = {
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
+void x86_spec_ctrl_set(u64 val) +{ + if (val & ~SPEC_CTRL_IBRS) + WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val); + else + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val); +} +EXPORT_SYMBOL_GPL(x86_spec_ctrl_set); + +u64 x86_spec_ctrl_get_default(void) +{ + return x86_spec_ctrl_base; +} +EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
#ifdef RETPOLINE static bool spectre_v2_bad_module;
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 5cf687548705412da47c9cec342fd952d71ed3d5 upstream
A guest may modify the SPEC_CTRL MSR from the value used by the kernel. Since the kernel doesn't use IBRS, this means a value of zero is what is needed in the host.
But the 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to the other bits as reserved so the kernel should respect the boot time SPEC_CTRL value and use that.
This allows us to deal with future extensions to the SPEC_CTRL interface, if any.
Note: This uses wrmsrl() instead of native_wrmsrl(). It does not make any difference, as paravirt will overwrite the callq *0xfff.. with the wrmsrl assembler code.
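Had the KVM changes been practical to backport, the intended usage of these helpers around VM entry/exit would look roughly like this (hypothetical sketch; vcpu_spec_ctrl stands in for wherever the guest's view of the MSR is kept):

    /* Before entering the guest: install the guest's SPEC_CTRL view.
     * The helper only writes the MSR if it differs from the host base. */
    x86_spec_ctrl_set_guest(vcpu_spec_ctrl);

    /* ... VMRUN/VMLAUNCH, guest executes, VMEXIT ... */

    /* After exit: restore the host's boot-time SPEC_CTRL value. */
    x86_spec_ctrl_restore_host(vcpu_spec_ctrl);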
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Srivatsa: Backported to 4.4.y, skipping the KVM changes in this patch. ] Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 10 ++++++++++ arch/x86/kernel/cpu/bugs.c | 18 ++++++++++++++++++ 2 files changed, 28 insertions(+)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index daec318..11db69a 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -183,6 +183,16 @@ enum spectre_v2_mitigation { extern void x86_spec_ctrl_set(u64); extern u64 x86_spec_ctrl_get_default(void);
+/* + * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR + * the guest has, while on VMEXIT we restore the host view. This + * would be easier if SPEC_CTRL were architecturally maskable or + * shadowable for guests but this is not (currently) the case. + * Takes the guest view of SPEC_CTRL MSR as a parameter. + */ +extern void x86_spec_ctrl_set_guest(u64); +extern void x86_spec_ctrl_restore_host(u64); + extern char __indirect_thunk_start[]; extern char __indirect_thunk_end[];
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 42c2204..e71e281 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -122,6 +122,24 @@ u64 x86_spec_ctrl_get_default(void) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
+void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) +{ + if (!boot_cpu_has(X86_FEATURE_IBRS)) + return; + if (x86_spec_ctrl_base != guest_spec_ctrl) + wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl); +} +EXPORT_SYMBOL_GPL(x86_spec_ctrl_set_guest); + +void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) +{ + if (!boot_cpu_has(X86_FEATURE_IBRS)) + return; + if (x86_spec_ctrl_base != guest_spec_ctrl) + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); +} +EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host); + #ifdef RETPOLINE static bool spectre_v2_bad_module;
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit f5fbf848303c8704d0e1a1e7cabd08fd0a49552f upstream
Merrifield2 is actually Moorefield.
Rename it accordingly and drop the tail digit from Merrifield1.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20160906184254.94440-1-andriy.shevchenko@linux.inte... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/intel-family.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index 12fa187..0b27c1e 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -58,8 +58,8 @@ #define INTEL_FAM6_ATOM_SILVERMONT1 0x37 /* BayTrail/BYT / Valleyview */ #define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */ #define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */ -#define INTEL_FAM6_ATOM_MERRIFIELD1 0x4A /* Tangier */ -#define INTEL_FAM6_ATOM_MERRIFIELD2 0x5A /* Annidale */ +#define INTEL_FAM6_ATOM_MERRIFIELD 0x4A /* Tangier */ +#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Annidale */ #define INTEL_FAM6_ATOM_GOLDMONT 0x5C #define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */ #define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A
From: Piotr Luc piotr.luc@intel.com
commit 0047f59834e5947d45f34f5f12eb330d158f700b upstream
Add CPUID of Knights Mill (KNM) processor to Intel family list.
Signed-off-by: Piotr Luc piotr.luc@intel.com Reviewed-by: Dave Hansen dave.hansen@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Brian Gerst brgerst@gmail.com Cc: Denys Vlasenko dvlasenk@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20161012180520.30976-1-piotr.luc@intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/intel-family.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index 0b27c1e..e13ff5a 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -67,5 +67,6 @@ /* Xeon Phi */
#define INTEL_FAM6_XEON_PHI_KNL 0x57 /* Knights Landing */ +#define INTEL_FAM6_XEON_PHI_KNM 0x85 /* Knights Mill */
#endif /* _ASM_X86_INTEL_FAMILY_H */
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit c456442cd3a59eeb1d60293c26cbe2ff2c4e42cf upstream
Add the sysfs file for the new vulnerability. It does not do much except show the words 'Vulnerable' for recent x86 cores.
Intel cores prior to family 6 are known not to be vulnerable, and so are some Atoms and some Xeon Phi.
It assumes that older Cyrix, Centaur, etc. cores are immune.
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/ABI/testing/sysfs-devices-system-cpu | 1 + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/bugs.c | 5 ++++ arch/x86/kernel/cpu/common.c | 23 ++++++++++++++++++++ drivers/base/cpu.c | 8 +++++++ include/linux/cpu.h | 2 ++ 6 files changed, 40 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index ea6a043..50f9568 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -276,6 +276,7 @@ What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/meltdown /sys/devices/system/cpu/vulnerabilities/spectre_v1 /sys/devices/system/cpu/vulnerabilities/spectre_v2 + /sys/devices/system/cpu/vulnerabilities/spec_store_bypass Date: January 2018 Contact: Linux kernel mailing list linux-kernel@vger.kernel.org Description: Information about CPU vulnerabilities diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index bc76bf3..08cf4f7 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -315,5 +315,6 @@ #define X86_BUG_CPU_MELTDOWN X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */ #define X86_BUG_SPECTRE_V1 X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */ #define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */ +#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index e71e281..0ad13b1 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -403,4 +403,9 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c { return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V2); } + +ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_SPEC_STORE_BYPASS); +} #endif diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 97558d1..eb78ddf 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -835,10 +835,33 @@ static const __initconst struct x86_cpu_id cpu_no_meltdown[] = { {} };
+static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = { + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL }, + { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM }, + { X86_VENDOR_CENTAUR, 5, }, + { X86_VENDOR_INTEL, 5, }, + { X86_VENDOR_NSC, 5, }, + { X86_VENDOR_ANY, 4, }, + {} +}; + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = 0;
+ if (!x86_match_cpu(cpu_no_spec_store_bypass)) + setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS); + if (x86_match_cpu(cpu_no_speculation)) return;
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 3db71af..143edea 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -518,14 +518,22 @@ ssize_t __weak cpu_show_spectre_v2(struct device *dev, return sprintf(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_spec_store_bypass(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); +static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, &dev_attr_spectre_v1.attr, &dev_attr_spectre_v2.attr, + &dev_attr_spec_store_bypass.attr, NULL };
diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 7e04bcd..2f9d120 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -46,6 +46,8 @@ extern ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_spec_store_bypass(struct device *dev, + struct device_attribute *attr, char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 0cc5fa00b0a88dad140b4e5c2cead9951ad36822 upstream
Add the CPU feature bit CPUID.7.0.EDX[31] which indicates whether the CPU supports Reduced Data Speculation.
[ tglx: Split it out from a later patch ]
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 08cf4f7..3fce65d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -297,6 +297,7 @@ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */ +#define X86_FEATURE_RDS (18*32+31) /* Reduced Data Speculation */
/* * BUG word(s)
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 24f7fc83b9204d20f878c57cb77d261ae825e033 upstream
Contemporary high performance processors use a common industry-wide optimization known as "Speculative Store Bypass" in which loads from addresses to which a recent store has occurred may (speculatively) see an older value. Intel refers to this feature as "Memory Disambiguation" which is part of their "Smart Memory Access" capability.
Memory Disambiguation can expose a cache side-channel attack against such speculatively read values. An attacker can create exploit code that allows them to read memory outside of a sandbox environment (for example, malicious JavaScript in a web page), or to perform more complex attacks against code running within the same privilege level, e.g. via the stack.
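To make the hazard concrete, here is a minimal C sketch in the style of published proof-of-concept gadgets (illustrative only; array1, array2 and victim are hypothetical):

    unsigned char array1[256];
    unsigned char array2[256 * 512];

    void victim(volatile unsigned long *idx_ptr)
    {
            *idx_ptr = 0;                    /* store a safe index          */
            unsigned long idx = *idx_ptr;    /* the load may speculatively  */
                                             /* bypass the store above and  */
                                             /* see a stale, attacker-      */
                                             /* chosen index...             */
            (void)array2[array1[idx] * 512]; /* ...leaving a cache footprint */
    }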
As a first step to mitigate against such attacks, provide two boot command line control knobs:
nospec_store_bypass_disable spec_store_bypass_disable=[off,auto,on]
By default affected x86 processors will power on with Speculative Store Bypass enabled. Hence the provided kernel parameters are written from the point of view of whether to enable a mitigation or not. The parameters are as follows:
- auto - Kernel detects whether your CPU model contains an implementation of Speculative Store Bypass and picks the most appropriate mitigation.
- on - disable Speculative Store Bypass - off - enable Speculative Store Bypass
[ tglx: Reordered the checks so that the whole evaluation is not done when the CPU does not support RDS ]
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/kernel-parameters.txt | 33 +++++++++++ arch/x86/include/asm/cpufeatures.h | 1 arch/x86/include/asm/nospec-branch.h | 6 ++ arch/x86/kernel/cpu/bugs.c | 103 ++++++++++++++++++++++++++++++++++ 4 files changed, 143 insertions(+)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e60d0b5..dc138b8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2460,6 +2460,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted. allow data leaks with this option, which is equivalent to spectre_v2=off.
+ nospec_store_bypass_disable + [HW] Disable all mitigations for the Speculative Store Bypass vulnerability + noxsave [BUGS=X86] Disables x86 extended register state save and restore using xsave. The kernel will fallback to enabling legacy floating-point and sse state. @@ -3623,6 +3626,36 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Not specifying this option is equivalent to spectre_v2=auto.
+ spec_store_bypass_disable= + [HW] Control Speculative Store Bypass (SSB) Disable mitigation + (Speculative Store Bypass vulnerability) + + Certain CPUs are vulnerable to an exploit against a + common industry wide performance optimization known + as "Speculative Store Bypass" in which recent stores + to the same memory location may not be observed by + later loads during speculative execution. The idea + is that such stores are unlikely and that they can + be detected prior to instruction retirement at the + end of a particular speculation execution window. + + In vulnerable processors, the speculatively forwarded + store can be used in a cache side channel attack, for + example to read memory to which the attacker does not + directly have access (e.g. inside sandboxed code). + + This parameter controls whether the Speculative Store + Bypass optimization is used. + + on - Unconditionally disable Speculative Store Bypass + off - Unconditionally enable Speculative Store Bypass + auto - Kernel detects whether the CPU model contains an + implementation of Speculative Store Bypass and + picks the most appropriate mitigation + + Not specifying this option is equivalent to + spec_store_bypass_disable=auto. + spia_io_base= [HW,MTD] spia_fio_base= spia_pedr= diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 3fce65d..9510f5f 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -203,6 +203,7 @@
#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ +#define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 11db69a..c786d01 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -193,6 +193,12 @@ extern u64 x86_spec_ctrl_get_default(void); extern void x86_spec_ctrl_set_guest(u64); extern void x86_spec_ctrl_restore_host(u64);
+/* The Speculative Store Bypass disable variants */ +enum ssb_mitigation { + SPEC_STORE_BYPASS_NONE, + SPEC_STORE_BYPASS_DISABLE, +}; + extern char __indirect_thunk_start[]; extern char __indirect_thunk_end[];
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 0ad13b1..826aa81 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -26,6 +26,7 @@ #include <asm/intel-family.h>
static void __init spectre_v2_select_mitigation(void); +static void __init ssb_select_mitigation(void);
/* * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any @@ -52,6 +53,12 @@ void __init check_bugs(void) /* Select the proper spectre mitigation before patching alternatives */ spectre_v2_select_mitigation();
+ /* + * Select proper mitigation for any exposure to the Speculative Store + * Bypass vulnerability. + */ + ssb_select_mitigation(); + #ifdef CONFIG_X86_32 /* * Check whether we are able to run this kernel safely on SMP. @@ -357,6 +364,99 @@ retpoline_auto: }
#undef pr_fmt +#define pr_fmt(fmt) "Speculative Store Bypass: " fmt + +static enum ssb_mitigation ssb_mode = SPEC_STORE_BYPASS_NONE; + +/* The kernel command line selection */ +enum ssb_mitigation_cmd { + SPEC_STORE_BYPASS_CMD_NONE, + SPEC_STORE_BYPASS_CMD_AUTO, + SPEC_STORE_BYPASS_CMD_ON, +}; + +static const char *ssb_strings[] = { + [SPEC_STORE_BYPASS_NONE] = "Vulnerable", + [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled" +}; + +static const struct { + const char *option; + enum ssb_mitigation_cmd cmd; +} ssb_mitigation_options[] = { + { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ + { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ + { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ +}; + +static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void) +{ + enum ssb_mitigation_cmd cmd = SPEC_STORE_BYPASS_CMD_AUTO; + char arg[20]; + int ret, i; + + if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) { + return SPEC_STORE_BYPASS_CMD_NONE; + } else { + ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable", + arg, sizeof(arg)); + if (ret < 0) + return SPEC_STORE_BYPASS_CMD_AUTO; + + for (i = 0; i < ARRAY_SIZE(ssb_mitigation_options); i++) { + if (!match_option(arg, ret, ssb_mitigation_options[i].option)) + continue; + + cmd = ssb_mitigation_options[i].cmd; + break; + } + + if (i >= ARRAY_SIZE(ssb_mitigation_options)) { + pr_err("unknown option (%s). Switching to AUTO select\n", arg); + return SPEC_STORE_BYPASS_CMD_AUTO; + } + } + + return cmd; +} + +static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) +{ + enum ssb_mitigation mode = SPEC_STORE_BYPASS_NONE; + enum ssb_mitigation_cmd cmd; + + if (!boot_cpu_has(X86_FEATURE_RDS)) + return mode; + + cmd = ssb_parse_cmdline(); + if (!boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS) && + (cmd == SPEC_STORE_BYPASS_CMD_NONE || + cmd == SPEC_STORE_BYPASS_CMD_AUTO)) + return mode; + + switch (cmd) { + case SPEC_STORE_BYPASS_CMD_AUTO: + case SPEC_STORE_BYPASS_CMD_ON: + mode = SPEC_STORE_BYPASS_DISABLE; + break; + case SPEC_STORE_BYPASS_CMD_NONE: + break; + } + + if (mode != SPEC_STORE_BYPASS_NONE) + setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE); + return mode; +} + +static void ssb_select_mitigation() +{ + ssb_mode = __ssb_select_mitigation(); + + if (boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS)) + pr_info("%s\n", ssb_strings[ssb_mode]); +} + +#undef pr_fmt
#ifdef CONFIG_SYSFS
@@ -382,6 +482,9 @@ ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", spectre_v2_module_string());
+ case X86_BUG_SPEC_STORE_BYPASS: + return sprintf(buf, "%s\n", ssb_strings[ssb_mode]); + default: break; }
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 772439717dbf703b39990be58d8d4e3e4ad0598a upstream
Intel CPUs expose methods to:
- Detect whether RDS capability is available via CPUID.7.0.EDX[31],
- The SPEC_CTRL MSR(0x48), bit 2 set to enable RDS.
- MSR_IA32_ARCH_CAPABILITIES, Bit(4): no need to enable RDS.
With that in mind, if spec_store_bypass_disable=[auto,on] is selected, set the SPEC_CTRL MSR at boot time to enable RDS if the platform requires it.
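For reference, the CPUID detection step above can be probed from C like so (a hedged userspace sketch using GCC's <cpuid.h> helper, not how the kernel does it):

    #include <cpuid.h>
    #include <stdbool.h>

    static bool cpu_has_rds(void)
    {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID leaf 7, subleaf 0: RDS support is EDX bit 31. */
            if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                    return false;
            return !!(edx & (1u << 31));
    }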
Note that this does not fix the KVM case where SPEC_CTRL is exposed to guests, which can muck with it; see the patch titled: KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS.
And for the firmware (IBRS to be set), see the patch titled: x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits
[ tglx: Disentangled it from the intel implementation and kept the call order ]
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/msr-index.h | 6 ++++++ arch/x86/kernel/cpu/bugs.c | 30 ++++++++++++++++++++++++++++-- arch/x86/kernel/cpu/common.c | 10 ++++++---- arch/x86/kernel/cpu/cpu.h | 3 +++ arch/x86/kernel/cpu/intel.c | 1 + 5 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index f4701f0..a29edb7 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -35,6 +35,7 @@ #define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ #define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ #define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ +#define SPEC_CTRL_RDS (1 << 2) /* Reduced Data Speculation */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ #define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */ @@ -56,6 +57,11 @@ #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a #define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */ #define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */ +#define ARCH_CAP_RDS_NO (1 << 4) /* + * Not susceptible to Speculative Store Bypass + * attack, so no Reduced Data Speculation control + * required. + */
#define MSR_IA32_BBL_CR_CTL 0x00000119 #define MSR_IA32_BBL_CR_CTL3 0x0000011e diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 826aa81..56b84a5 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -116,7 +116,7 @@ static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
void x86_spec_ctrl_set(u64 val) { - if (val & ~SPEC_CTRL_IBRS) + if (val & ~(SPEC_CTRL_IBRS | SPEC_CTRL_RDS)) WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val); else wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val); @@ -443,8 +443,28 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) break; }
- if (mode != SPEC_STORE_BYPASS_NONE) + /* + * We have three CPU feature flags that are in play here: + * - X86_BUG_SPEC_STORE_BYPASS - CPU is susceptible. + * - X86_FEATURE_RDS - CPU is able to turn off speculative store bypass + * - X86_FEATURE_SPEC_STORE_BYPASS_DISABLE - engage the mitigation + */ + if (mode != SPEC_STORE_BYPASS_NONE) { setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE); + /* + * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses + * a completely different MSR and bit dependent on family. + */ + switch (boot_cpu_data.x86_vendor) { + case X86_VENDOR_INTEL: + x86_spec_ctrl_base |= SPEC_CTRL_RDS; + x86_spec_ctrl_set(SPEC_CTRL_RDS); + break; + case X86_VENDOR_AMD: + break; + } + } + return mode; }
@@ -458,6 +478,12 @@ static void ssb_select_mitigation()
#undef pr_fmt
+void x86_spec_ctrl_setup_ap(void) +{ + if (boot_cpu_has(X86_FEATURE_IBRS)) + x86_spec_ctrl_set(x86_spec_ctrl_base & (SPEC_CTRL_IBRS | SPEC_CTRL_RDS)); +} + #ifdef CONFIG_SYSFS
ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index eb78ddf..2f1d403 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -859,7 +859,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = 0;
- if (!x86_match_cpu(cpu_no_spec_store_bypass)) + if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); + + if (!x86_match_cpu(cpu_no_spec_store_bypass) && + !(ia32_cap & ARCH_CAP_RDS_NO)) setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (x86_match_cpu(cpu_no_speculation)) @@ -871,9 +875,6 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) if (x86_match_cpu(cpu_no_meltdown)) return;
- if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) - rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); - /* Rogue Data Cache Load? No! */ if (ia32_cap & ARCH_CAP_RDCL_NO) return; @@ -1216,6 +1217,7 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c) enable_sep_cpu(); #endif mtrr_ap_init(); + x86_spec_ctrl_setup_ap(); }
struct msr_range { diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h index 2584265..3b19d82 100644 --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -46,4 +46,7 @@ extern const struct cpu_dev *const __x86_cpu_dev_start[],
extern void get_cpu_cap(struct cpuinfo_x86 *c); extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c); + +extern void x86_spec_ctrl_setup_ap(void); + #endif /* ARCH_X86_CPU_H */ diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 77d9f68..ac25d1e5 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -119,6 +119,7 @@ static void early_init_intel(struct cpuinfo_x86 *c) setup_clear_cpu_cap(X86_FEATURE_STIBP); setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL); setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP); + setup_clear_cpu_cap(X86_FEATURE_RDS); }
/*
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 1115a859f33276fe8afb31c60cf9d8e657872558 upstream
Intel and AMD SPEC_CTRL (0x48) MSR semantics may differ in the future (or the two vendors may in fact use different MSRs for the same functionality).
As such, a run-time mechanism is required to whitelist the appropriate MSR values.
[ tglx: Made the variable __ro_after_init ] [ Srivatsa: Removed __ro_after_init for 4.4.y ]
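The effect is easiest to see as a worked example (restating the check from the diff below; illustrative only):

    /* Initially x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS = ~0x1, so only the
     * IBRS bit may be passed in. Once the Intel SSB setup also clears the
     * RDS bit (mask = ~0x5), val == SPEC_CTRL_RDS (0x4) passes the check: */
    if (val & x86_spec_ctrl_mask)
        WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);
    else
        wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);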
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 56b84a5..c37e211 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -34,6 +34,12 @@ static void __init ssb_select_mitigation(void); */ static u64 x86_spec_ctrl_base;
+/* + * The vendor and possibly platform specific bits which can be modified in + * x86_spec_ctrl_base. + */ +static u64 x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS; + void __init check_bugs(void) { identify_boot_cpu(); @@ -116,7 +122,7 @@ static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
void x86_spec_ctrl_set(u64 val) { - if (val & ~(SPEC_CTRL_IBRS | SPEC_CTRL_RDS)) + if (val & x86_spec_ctrl_mask) WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val); else wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val); @@ -458,6 +464,7 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) switch (boot_cpu_data.x86_vendor) { case X86_VENDOR_INTEL: x86_spec_ctrl_base |= SPEC_CTRL_RDS; + x86_spec_ctrl_mask &= ~SPEC_CTRL_RDS; x86_spec_ctrl_set(SPEC_CTRL_RDS); break; case X86_VENDOR_AMD: @@ -481,7 +488,7 @@ static void ssb_select_mitigation() void x86_spec_ctrl_setup_ap(void) { if (boot_cpu_has(X86_FEATURE_IBRS)) - x86_spec_ctrl_set(x86_spec_ctrl_base & (SPEC_CTRL_IBRS | SPEC_CTRL_RDS)); + x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask); }
#ifdef CONFIG_SYSFS
From: David Woodhouse dwmw@amazon.co.uk
commit 764f3c21588a059cd783c6ba0734d4db2d72822d upstream
AMD does not need the Speculative Store Bypass mitigation to be enabled.
The controls for this are already available via MSR C001_1020. Each CPU family uses a different bit in that MSR for this.
[ tglx: Expose the bit mask via a variable and move the actual MSR fiddling into the bugs code as that's the right thing to do and also required to prepare for dynamic enable/disable ]
[ Srivatsa: Removed __ro_after_init for 4.4.y ]
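Concretely, enabling the mitigation on AMD then boils down to (a sketch restating the diff below):

    /* x86_amd_ls_cfg_base is cached at boot to avoid an RMW; the
     * family-specific bit (54 on 0x15, 33 on 0x16, 10 on 0x17) is OR'ed in. */
    wrmsrl(MSR_AMD64_LS_CFG, x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask);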
Suggested-by: Borislav Petkov bp@suse.de Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/nospec-branch.h | 4 ++++ arch/x86/kernel/cpu/amd.c | 26 ++++++++++++++++++++++++++ arch/x86/kernel/cpu/bugs.c | 27 ++++++++++++++++++++++++++- arch/x86/kernel/cpu/common.c | 4 ++++ 5 files changed, 61 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 9510f5f..b7cdd1c 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -204,6 +204,7 @@ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ +#define X86_FEATURE_AMD_RDS (7*32+24) /* "" AMD RDS implementation */
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index c786d01..ac2fdc96 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -199,6 +199,10 @@ enum ssb_mitigation { SPEC_STORE_BYPASS_DISABLE, };
+/* AMD specific Speculative Store Bypass MSR data */ +extern u64 x86_amd_ls_cfg_base; +extern u64 x86_amd_ls_cfg_rds_mask; + extern char __indirect_thunk_start[]; extern char __indirect_thunk_end[];
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 9b29414..4452f38 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -9,6 +9,7 @@ #include <asm/processor.h> #include <asm/apic.h> #include <asm/cpu.h> +#include <asm/nospec-branch.h> #include <asm/smp.h> #include <asm/pci-direct.h> #include <asm/delay.h> @@ -519,6 +520,26 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
if (cpu_has(c, X86_FEATURE_MWAITX)) use_mwaitx_delay(); + + if (c->x86 >= 0x15 && c->x86 <= 0x17) { + unsigned int bit; + + switch (c->x86) { + case 0x15: bit = 54; break; + case 0x16: bit = 33; break; + case 0x17: bit = 10; break; + default: return; + } + /* + * Try to cache the base value so further operations can + * avoid RMW. If that faults, do not enable RDS. + */ + if (!rdmsrl_safe(MSR_AMD64_LS_CFG, &x86_amd_ls_cfg_base)) { + setup_force_cpu_cap(X86_FEATURE_RDS); + setup_force_cpu_cap(X86_FEATURE_AMD_RDS); + x86_amd_ls_cfg_rds_mask = 1ULL << bit; + } + } }
static void early_init_amd(struct cpuinfo_x86 *c) @@ -794,6 +815,11 @@ static void init_amd(struct cpuinfo_x86 *c) /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */ if (!cpu_has(c, X86_FEATURE_XENPV)) set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); + + if (boot_cpu_has(X86_FEATURE_AMD_RDS)) { + set_cpu_cap(c, X86_FEATURE_RDS); + set_cpu_cap(c, X86_FEATURE_AMD_RDS); + } }
#ifdef CONFIG_X86_32 diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index c37e211..b8911af 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -40,6 +40,13 @@ static u64 x86_spec_ctrl_base; */ static u64 x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS;
+/* + * AMD specific MSR info for Speculative Store Bypass control. + * x86_amd_ls_cfg_rds_mask is initialized in identify_boot_cpu(). + */ +u64 x86_amd_ls_cfg_base; +u64 x86_amd_ls_cfg_rds_mask; + void __init check_bugs(void) { identify_boot_cpu(); @@ -51,7 +58,8 @@ void __init check_bugs(void)
/* * Read the SPEC_CTRL MSR to account for reserved bits which may - * have unknown values. + * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD + * init code as it is not enumerated and depends on the family. */ if (boot_cpu_has(X86_FEATURE_IBRS)) rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); @@ -153,6 +161,14 @@ void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host);
+static void x86_amd_rds_enable(void) +{ + u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask; + + if (boot_cpu_has(X86_FEATURE_AMD_RDS)) + wrmsrl(MSR_AMD64_LS_CFG, msrval); +} + #ifdef RETPOLINE static bool spectre_v2_bad_module;
@@ -442,6 +458,11 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void)
switch (cmd) { case SPEC_STORE_BYPASS_CMD_AUTO: + /* + * AMD platforms by default don't need SSB mitigation. + */ + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) + break; case SPEC_STORE_BYPASS_CMD_ON: mode = SPEC_STORE_BYPASS_DISABLE; break; @@ -468,6 +489,7 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) x86_spec_ctrl_set(SPEC_CTRL_RDS); break; case X86_VENDOR_AMD: + x86_amd_rds_enable(); break; } } @@ -489,6 +511,9 @@ void x86_spec_ctrl_setup_ap(void) { if (boot_cpu_has(X86_FEATURE_IBRS)) x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask); + + if (ssb_mode == SPEC_STORE_BYPASS_DISABLE) + x86_amd_rds_enable(); }
#ifdef CONFIG_SYSFS diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 2f1d403..7405c86 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -851,6 +851,10 @@ static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = { { X86_VENDOR_CENTAUR, 5, }, { X86_VENDOR_INTEL, 5, }, { X86_VENDOR_NSC, 5, }, + { X86_VENDOR_AMD, 0x12, }, + { X86_VENDOR_AMD, 0x11, }, + { X86_VENDOR_AMD, 0x10, }, + { X86_VENDOR_AMD, 0xf, }, { X86_VENDOR_ANY, 4, }, {} };
From: Thomas Gleixner tglx@linutronix.de
commit 28a2775217b17208811fa43a9e96bd1fdf417b86 upstream
Having everything in nospec-branch.h creates a dependency hell when adding the prctl-based switching mechanism. Move everything that is not required in nospec-branch.h to spec-ctrl.h and fix up the includes in the relevant files.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 14 -------------- arch/x86/include/asm/spec-ctrl.h | 21 +++++++++++++++++++++ arch/x86/kernel/cpu/amd.c | 2 +- arch/x86/kernel/cpu/bugs.c | 2 +- arch/x86/kvm/svm.c | 2 +- arch/x86/kvm/vmx.c | 2 +- 6 files changed, 25 insertions(+), 18 deletions(-) create mode 100644 arch/x86/include/asm/spec-ctrl.h
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index ac2fdc96..47c454c 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -183,26 +183,12 @@ enum spectre_v2_mitigation { extern void x86_spec_ctrl_set(u64); extern u64 x86_spec_ctrl_get_default(void);
-/* - * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR - * the guest has, while on VMEXIT we restore the host view. This - * would be easier if SPEC_CTRL were architecturally maskable or - * shadowable for guests but this is not (currently) the case. - * Takes the guest view of SPEC_CTRL MSR as a parameter. - */ -extern void x86_spec_ctrl_set_guest(u64); -extern void x86_spec_ctrl_restore_host(u64); - /* The Speculative Store Bypass disable variants */ enum ssb_mitigation { SPEC_STORE_BYPASS_NONE, SPEC_STORE_BYPASS_DISABLE, };
-/* AMD specific Speculative Store Bypass MSR data */ -extern u64 x86_amd_ls_cfg_base; -extern u64 x86_amd_ls_cfg_rds_mask; - extern char __indirect_thunk_start[]; extern char __indirect_thunk_end[];
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h new file mode 100644 index 0000000..3ad6442 --- /dev/null +++ b/arch/x86/include/asm/spec-ctrl.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_SPECCTRL_H_ +#define _ASM_X86_SPECCTRL_H_ + +#include <asm/nospec-branch.h> + +/* + * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR + * the guest has, while on VMEXIT we restore the host view. This + * would be easier if SPEC_CTRL were architecturally maskable or + * shadowable for guests but this is not (currently) the case. + * Takes the guest view of SPEC_CTRL MSR as a parameter. + */ +extern void x86_spec_ctrl_set_guest(u64); +extern void x86_spec_ctrl_restore_host(u64); + +/* AMD specific Speculative Store Bypass MSR data */ +extern u64 x86_amd_ls_cfg_base; +extern u64 x86_amd_ls_cfg_rds_mask; + +#endif diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 4452f38..14e9849 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -9,7 +9,7 @@ #include <asm/processor.h> #include <asm/apic.h> #include <asm/cpu.h> -#include <asm/nospec-branch.h> +#include <asm/spec-ctrl.h> #include <asm/smp.h> #include <asm/pci-direct.h> #include <asm/delay.h> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index b8911af..47a3cc0 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -12,7 +12,7 @@ #include <linux/cpu.h> #include <linux/module.h>
-#include <asm/nospec-branch.h> +#include <asm/spec-ctrl.h> #include <asm/cmdline.h> #include <asm/bugs.h> #include <asm/processor.h> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 4265437..df7827a 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -37,7 +37,7 @@ #include <asm/desc.h> #include <asm/debugreg.h> #include <asm/kvm_para.h> -#include <asm/nospec-branch.h> +#include <asm/spec-ctrl.h>
#include <asm/virtext.h> #include "trace.h" diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 63c44a9..1814388 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -48,7 +48,7 @@ #include <asm/kexec.h> #include <asm/apic.h> #include <asm/irq_remapping.h> -#include <asm/nospec-branch.h> +#include <asm/spec-ctrl.h>
#include "trace.h" #include "pmu.h"
From: Thomas Gleixner tglx@linutronix.de
commit b617cfc858161140d69cc0b5cc211996b557a1c7 upstream
Add two new prctls to control aspects of speculation-related vulnerabilities and their mitigations, to provide finer-grained control over performance-impacting mitigations.
PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature which is selected with arg2 of prctl(2). The return value uses bits 0-2 with the following meaning:
Bit  Define            Description
0    PR_SPEC_PRCTL     Mitigation can be controlled per task by
                       PR_SET_SPECULATION_CTRL
1    PR_SPEC_ENABLE    The speculation feature is enabled, mitigation is
                       disabled
2    PR_SPEC_DISABLE   The speculation feature is disabled, mitigation is
                       enabled
If all bits are 0 the CPU is not affected by the speculation misfeature.
If PR_SPEC_PRCTL is set, then per-task control of the mitigation is available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will fail.
PR_SET_SPECULATION_CTRL allows controlling the speculation misfeature, which is selected by arg2 of prctl(2), per task. arg3 is used to hand in the control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.
The common return values are:
EINVAL  prctl is not implemented by the architecture or the unused
        prctl() arguments are not 0
ENODEV  arg2 is selecting an unsupported speculation misfeature
PR_SET_SPECULATION_CTRL has these additional return values:
ERANGE  arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor
        PR_SPEC_DISABLE
ENXIO   prctl control of the selected speculation misfeature is disabled
The first supported controllable speculation misfeature is PR_SPEC_STORE_BYPASS. Add the define so this can be shared between architectures.
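A minimal userspace usage sketch (assuming a libc/uapi header that carries the new defines; illustrative only):

    #include <stdio.h>
    #include <sys/prctl.h>
    #include <linux/prctl.h>

    int main(void)
    {
        int state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
                          0, 0, 0);

        /* Opt in to the mitigation if per-task control is available. */
        if (state >= 0 && (state & PR_SPEC_PRCTL))
            prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
                  PR_SPEC_DISABLE, 0, 0);

        printf("speculation ctrl state: 0x%x\n", state);
        return 0;
    }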
Based on an initial patch from Tim Chen and mostly rewritten.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/spec_ctrl.txt | 86 +++++++++++++++++++++++++++++++++++++++++++ include/linux/nospec.h | 5 +++ include/uapi/linux/prctl.h | 11 ++++++ kernel/sys.c | 20 ++++++++++ 4 files changed, 122 insertions(+) create mode 100644 Documentation/spec_ctrl.txt
diff --git a/Documentation/spec_ctrl.txt b/Documentation/spec_ctrl.txt new file mode 100644 index 0000000..ddbebcd --- /dev/null +++ b/Documentation/spec_ctrl.txt @@ -0,0 +1,86 @@ +=================== +Speculation Control +=================== + +Quite some CPUs have speculation related misfeatures which are in fact +vulnerabilites causing data leaks in various forms even accross privilege +domains. + +The kernel provides mitigation for such vulnerabilities in various +forms. Some of these mitigations are compile time configurable and some on +the kernel command line. + +There is also a class of mitigations which are very expensive, but they can +be restricted to a certain set of processes or tasks in controlled +environments. The mechanism to control these mitigations is via +:manpage:`prctl(2)`. + +There are two prctl options which are related to this: + + * PR_GET_SPECULATION_CTRL + + * PR_SET_SPECULATION_CTRL + +PR_GET_SPECULATION_CTRL +----------------------- + +PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature +which is selected with arg2 of prctl(2). The return value uses bits 0-2 with +the following meaning: + +==== ================ =================================================== +Bit Define Description +==== ================ =================================================== +0 PR_SPEC_PRCTL Mitigation can be controlled per task by + PR_SET_SPECULATION_CTRL +1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is + disabled +2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is + enabled +==== ================ =================================================== + +If all bits are 0 the CPU is not affected by the speculation misfeature. + +If PR_SPEC_PRCTL is set, then the per task control of the mitigation is +available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation +misfeature will fail. + +PR_SET_SPECULATION_CTRL +----------------------- +PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which +is selected by arg2 of :manpage:`prctl(2)` per task. arg3 is used to hand +in the control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE. + +Common error codes +------------------ +======= ================================================================= +Value Meaning +======= ================================================================= +EINVAL The prctl is not implemented by the architecture or unused + prctl(2) arguments are not 0 + +ENODEV arg2 is selecting a not supported speculation misfeature +======= ================================================================= + +PR_SET_SPECULATION_CTRL error codes +----------------------------------- +======= ================================================================= +Value Meaning +======= ================================================================= +0 Success + +ERANGE arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor + PR_SPEC_DISABLE + +ENXIO Control of the selected speculation misfeature is not possible. + See PR_GET_SPECULATION_CTRL. 
+======= ================================================================= + +Speculation misfeature controls +------------------------------- +- PR_SPEC_STORE_BYPASS: Speculative Store Bypass + + Invocations: + * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0); + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0); + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0); diff --git a/include/linux/nospec.h b/include/linux/nospec.h index e791ebc..700bb8a 100644 --- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -55,4 +55,9 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, \ (typeof(_i)) (_i & _mask); \ }) + +/* Speculation control prctl */ +int arch_prctl_spec_ctrl_get(unsigned long which); +int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl); + #endif /* _LINUX_NOSPEC_H */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index a8d0759..3b316be 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -197,4 +197,15 @@ struct prctl_mm_map { # define PR_CAP_AMBIENT_LOWER 3 # define PR_CAP_AMBIENT_CLEAR_ALL 4
+/* Per task speculation control */ +#define PR_GET_SPECULATION_CTRL 52 +#define PR_SET_SPECULATION_CTRL 53 +/* Speculation control variants */ +# define PR_SPEC_STORE_BYPASS 0 +/* Return and control values for PR_SET/GET_SPECULATION_CTRL */ +# define PR_SPEC_NOT_AFFECTED 0 +# define PR_SPEC_PRCTL (1UL << 0) +# define PR_SPEC_ENABLE (1UL << 1) +# define PR_SPEC_DISABLE (1UL << 2) + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index 6624919..d80c33f 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2075,6 +2075,16 @@ static int prctl_get_tid_address(struct task_struct *me, int __user **tid_addr) } #endif
+int __weak arch_prctl_spec_ctrl_get(unsigned long which) +{ + return -EINVAL; +} + +int __weak arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl) +{ + return -EINVAL; +} + SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, unsigned long, arg4, unsigned long, arg5) { @@ -2269,6 +2279,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, case PR_GET_FP_MODE: error = GET_FP_MODE(me); break; + case PR_GET_SPECULATION_CTRL: + if (arg3 || arg4 || arg5) + return -EINVAL; + error = arch_prctl_spec_ctrl_get(arg2); + break; + case PR_SET_SPECULATION_CTRL: + if (arg4 || arg5) + return -EINVAL; + error = arch_prctl_spec_ctrl_set(arg2, arg3); + break; default: error = -EINVAL; break;
From: Kyle Huey me@kylehuey.com
commit af8b3cd3934ec60f4c2a420d19a9d416554f140b upstream
Help the compiler to avoid reevaluating the thread flags for each checked bit by reordering the bit checks and providing an explicit xor for evaluation.
With default defconfigs for each arch,
x86_64: arch/x86/kernel/process.o
text    data    bss     dec     hex
3056    8577    16      11649   2d81    Before
3024    8577    16      11617   2d61    After
i386: arch/x86/kernel/process.o
text    data    bss     dec     hex
2957    8673    8       11638   2d76    Before
2925    8673    8       11606   2d56    After
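The pattern itself, distilled from the diff below (illustrative fragment only):

    /* One XOR exposes every flag whose state differs between the outgoing
     * (tifp) and incoming (tifn) task, so each bit is tested once instead
     * of re-reading both tasks' flags per check. */
    unsigned long tifn = READ_ONCE(task_thread_info(next_p)->flags);
    unsigned long tifp = READ_ONCE(task_thread_info(prev_p)->flags);

    if ((tifp ^ tifn) & _TIF_BLOCKSTEP) {
        /* BLOCKSTEP state changed: resync DEBUGCTLMSR_BTF from tifn */
    }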
Originally-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Kyle Huey khuey@kylehuey.com Cc: Peter Zijlstra peterz@infradead.org Cc: Andy Lutomirski luto@kernel.org Link: http://lkml.kernel.org/r/20170214081104.9244-2-khuey@kylehuey.com Signed-off-by: Thomas Gleixner tglx@linutronix.de
[dwmw2: backported to make TIF_RDS handling simpler. No deferred TR reload.] Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/process.c | 54 ++++++++++++++++++++++++++------------------- 1 file changed, 31 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 7c5c5dc..cc0f288 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -188,48 +188,56 @@ int set_tsc_mode(unsigned int val) return 0; }
+static inline void switch_to_bitmap(struct tss_struct *tss, + struct thread_struct *prev, + struct thread_struct *next, + unsigned long tifp, unsigned long tifn) +{ + if (tifn & _TIF_IO_BITMAP) { + /* + * Copy the relevant range of the IO bitmap. + * Normally this is 128 bytes or less: + */ + memcpy(tss->io_bitmap, next->io_bitmap_ptr, + max(prev->io_bitmap_max, next->io_bitmap_max)); + } else if (tifp & _TIF_IO_BITMAP) { + /* + * Clear any possible leftover bits: + */ + memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); + } +} + void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, struct tss_struct *tss) { struct thread_struct *prev, *next; + unsigned long tifp, tifn;
prev = &prev_p->thread; next = &next_p->thread;
- if (test_tsk_thread_flag(prev_p, TIF_BLOCKSTEP) ^ - test_tsk_thread_flag(next_p, TIF_BLOCKSTEP)) { + tifn = READ_ONCE(task_thread_info(next_p)->flags); + tifp = READ_ONCE(task_thread_info(prev_p)->flags); + switch_to_bitmap(tss, prev, next, tifp, tifn); + + propagate_user_return_notify(prev_p, next_p); + + if ((tifp ^ tifn) & _TIF_BLOCKSTEP) { unsigned long debugctl = get_debugctlmsr();
debugctl &= ~DEBUGCTLMSR_BTF; - if (test_tsk_thread_flag(next_p, TIF_BLOCKSTEP)) + if (tifn & _TIF_BLOCKSTEP) debugctl |= DEBUGCTLMSR_BTF; - update_debugctlmsr(debugctl); }
- if (test_tsk_thread_flag(prev_p, TIF_NOTSC) ^ - test_tsk_thread_flag(next_p, TIF_NOTSC)) { - /* prev and next are different */ - if (test_tsk_thread_flag(next_p, TIF_NOTSC)) + if ((tifp ^ tifn) & _TIF_NOTSC) { + if (tifn & _TIF_NOTSC) hard_disable_TSC(); else hard_enable_TSC(); } - - if (test_tsk_thread_flag(next_p, TIF_IO_BITMAP)) { - /* - * Copy the relevant range of the IO bitmap. - * Normally this is 128 bytes or less: - */ - memcpy(tss->io_bitmap, next->io_bitmap_ptr, - max(prev->io_bitmap_max, next->io_bitmap_max)); - } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { - /* - * Clear any possible leftover bits: - */ - memset(tss->io_bitmap, 0xff, prev->io_bitmap_max); - } - propagate_user_return_notify(prev_p, next_p); }
/*
From: Kyle Huey me@kylehuey.com
commit b9894a2f5bd18b1691cb6872c9afe32b148d0132 upstream
The debug control MSR is "highly magical" as the blockstep bit can be cleared by hardware under poorly documented circumstances.
So a task switch relying on the bit set by the previous task (according to the previous task's thread flags) can trip over this and fail to update the flag for the next task.
To fix this, it is required to handle DEBUGCTLMSR_BTF when either the previous or the next task, or both, have the TIF_BLOCKSTEP flag set.
While at it, avoid branching within the TIF_BLOCKSTEP case and evaluating boot_cpu_data twice in kernels without CONFIG_X86_DEBUGCTLMSR.
x86_64: arch/x86/kernel/process.o
text    data    bss     dec     hex
3024    8577    16      11617   2d61    Before
3008    8577    16      11601   2d51    After
i386: No change
[ tglx: Made the shift value explicit, used a local variable to make the code readable and massaged the changelog ]
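The branchless update, distilled from the diff below (illustrative sketch):

    /* Isolate the incoming task's TIF_BLOCKSTEP bit and reposition it at
     * DEBUGCTLMSR_BTF (bit 1) -- no conditional required. */
    static unsigned long btf_update(unsigned long debugctl, unsigned long tifn)
    {
        unsigned long msk = tifn & _TIF_BLOCKSTEP;

        debugctl &= ~DEBUGCTLMSR_BTF;
        debugctl |= (msk >> TIF_BLOCKSTEP) << DEBUGCTLMSR_BTF_SHIFT;
        return debugctl;
    }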
Originally-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Kyle Huey khuey@kylehuey.com Cc: Peter Zijlstra peterz@infradead.org Cc: Andy Lutomirski luto@kernel.org Link: http://lkml.kernel.org/r/20170214081104.9244-3-khuey@kylehuey.com Signed-off-by: Thomas Gleixner tglx@linutronix.de
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/msr-index.h | 1 + arch/x86/kernel/process.c | 12 +++++++----- 2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index a29edb7..71a2c84 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -150,6 +150,7 @@
/* DEBUGCTLMSR bits (others vary by model): */ #define DEBUGCTLMSR_LBR (1UL << 0) /* last branch recording */ +#define DEBUGCTLMSR_BTF_SHIFT 1 #define DEBUGCTLMSR_BTF (1UL << 1) /* single-step on branches */ #define DEBUGCTLMSR_TR (1UL << 6) #define DEBUGCTLMSR_BTS (1UL << 7) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index cc0f288..166aef3 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -223,13 +223,15 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
propagate_user_return_notify(prev_p, next_p);
- if ((tifp ^ tifn) & _TIF_BLOCKSTEP) { - unsigned long debugctl = get_debugctlmsr(); + if ((tifp & _TIF_BLOCKSTEP || tifn & _TIF_BLOCKSTEP) && + arch_has_block_step()) { + unsigned long debugctl, msk;
+ rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl); debugctl &= ~DEBUGCTLMSR_BTF; - if (tifn & _TIF_BLOCKSTEP) - debugctl |= DEBUGCTLMSR_BTF; - update_debugctlmsr(debugctl); + msk = tifn & _TIF_BLOCKSTEP; + debugctl |= (msk >> TIF_BLOCKSTEP) << DEBUGCTLMSR_BTF_SHIFT; + wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl); }
if ((tifp ^ tifn) & _TIF_NOTSC) {
From: Thomas Gleixner tglx@linutronix.de
commit 5a920155e388ec22a22e0532fb695b9215c9b34d upstream
Provide and use a toggle helper instead of doing it with a branch.
x86_64: arch/x86/kernel/process.o
text    data    bss     dec     hex
3008    8577    16      11601   2d51    Before
2976    8577    16      11569   2d31    After
i386: arch/x86/kernel/process.o
text    data    bss     dec     hex
2925    8673    8       11606   2d56    Before
2893    8673    8       11574   2d36    After
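Restated (illustrative): since __switch_to_xtra() only takes this path when TIF_NOTSC actually differs between prev and next, CR4.TSD must flip either way, so an unconditional XOR replaces the enable/disable branch:

    if ((tifp ^ tifn) & _TIF_NOTSC)
        cr4_toggle_bits(X86_CR4_TSD);   /* cr4 ^= X86_CR4_TSD under the hood */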
Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: Andy Lutomirski luto@kernel.org Link: http://lkml.kernel.org/r/20170214081104.9244-4-khuey@kylehuey.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/tlbflush.h | 10 ++++++++++ arch/x86/kernel/process.c | 22 ++++------------------ 2 files changed, 14 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 8ce07db..72cfe3e 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -111,6 +111,16 @@ static inline void cr4_clear_bits(unsigned long mask) } }
+static inline void cr4_toggle_bits(unsigned long mask) +{ + unsigned long cr4; + + cr4 = this_cpu_read(cpu_tlbstate.cr4); + cr4 ^= mask; + this_cpu_write(cpu_tlbstate.cr4, cr4); + __write_cr4(cr4); +} + /* Read the CR4 shadow. */ static inline unsigned long cr4_read_shadow(void) { diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 166aef3..d112963 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -130,11 +130,6 @@ void flush_thread(void) fpu__clear(&tsk->thread.fpu); }
-static void hard_disable_TSC(void) -{ - cr4_set_bits(X86_CR4_TSD); -} - void disable_TSC(void) { preempt_disable(); @@ -143,15 +138,10 @@ void disable_TSC(void) * Must flip the CPU state synchronously with * TIF_NOTSC in the current running context. */ - hard_disable_TSC(); + cr4_set_bits(X86_CR4_TSD); preempt_enable(); }
-static void hard_enable_TSC(void) -{ - cr4_clear_bits(X86_CR4_TSD); -} - static void enable_TSC(void) { preempt_disable(); @@ -160,7 +150,7 @@ static void enable_TSC(void) * Must flip the CPU state synchronously with * TIF_NOTSC in the current running context. */ - hard_enable_TSC(); + cr4_clear_bits(X86_CR4_TSD); preempt_enable(); }
@@ -234,12 +224,8 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl); }
- if ((tifp ^ tifn) & _TIF_NOTSC) { - if (tifn & _TIF_NOTSC) - hard_disable_TSC(); - else - hard_enable_TSC(); - } + if ((tifp ^ tifn) & _TIF_NOTSC) + cr4_toggle_bits(X86_CR4_TSD); }
/*
From: Thomas Gleixner tglx@linutronix.de
commit 885f82bfbc6fefb6664ea27965c3ab9ac4194b8c upstream
The Speculative Store Bypass vulnerability can be mitigated with the Reduced Data Speculation (RDS) feature. To allow finer-grained control of this potentially expensive mitigation, a per-task mitigation control is required.
Add a new TIF_RDS flag and put it into the group of TIF flags which are evaluated for mismatch in switch_to(). If these bits differ in the previous and the next task, then the slow path function __switch_to_xtra() is invoked. Implement the TIF_RDS dependent mitigation control in the slow path.
If the prctl for controlling Speculative Store Bypass is disabled or no task uses the prctl then there is no overhead in the switch_to() fast path.
Update the KVM-related speculation control functions to take TIF_RDS into account as well.
Based on a patch from Tim Chen. Completely rewritten.
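The TIF-to-MSR translation is pure bit repositioning; a worked example using the values from the diffs below (TIF_RDS == 5, SPEC_CTRL_RDS_SHIFT == 2):

    /* (tifn & _TIF_RDS) isolates bit 5; shifting right by
     * TIF_RDS - SPEC_CTRL_RDS_SHIFT == 3 lands it on SPEC_CTRL bit 2. */
    static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
    {
        return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT);
    }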
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ingo Molnar mingo@kernel.org Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/msr-index.h | 3 ++- arch/x86/include/asm/spec-ctrl.h | 17 +++++++++++++++++ arch/x86/include/asm/thread_info.h | 6 ++++-- arch/x86/kernel/cpu/bugs.c | 26 +++++++++++++++++++++----- arch/x86/kernel/process.c | 22 ++++++++++++++++++++++ 5 files changed, 66 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 71a2c84..883cf0d 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -35,7 +35,8 @@ #define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ #define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ #define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ -#define SPEC_CTRL_RDS (1 << 2) /* Reduced Data Speculation */ +#define SPEC_CTRL_RDS_SHIFT 2 /* Reduced Data Speculation bit */ +#define SPEC_CTRL_RDS (1 << SPEC_CTRL_RDS_SHIFT) /* Reduced Data Speculation */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ #define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */ diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 3ad6442..45ef00a 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SPECCTRL_H_ #define _ASM_X86_SPECCTRL_H_
+#include <linux/thread_info.h> #include <asm/nospec-branch.h>
/* @@ -18,4 +19,20 @@ extern void x86_spec_ctrl_restore_host(u64); extern u64 x86_amd_ls_cfg_base; extern u64 x86_amd_ls_cfg_rds_mask;
+/* The Intel SPEC CTRL MSR base value cache */ +extern u64 x86_spec_ctrl_base; + +static inline u64 rds_tif_to_spec_ctrl(u64 tifn) +{ + BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT); + return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT); +} + +static inline u64 rds_tif_to_amd_ls_cfg(u64 tifn) +{ + return (tifn & _TIF_RDS) ? x86_amd_ls_cfg_rds_mask : 0ULL; +} + +extern void speculative_store_bypass_update(void); + #endif diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 18c9aaa..36a49b4 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -92,6 +92,7 @@ struct thread_info { #define TIF_SIGPENDING 2 /* signal pending */ #define TIF_NEED_RESCHED 3 /* rescheduling necessary */ #define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/ +#define TIF_RDS 5 /* Reduced data speculation */ #define TIF_SYSCALL_EMU 6 /* syscall emulation active */ #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ #define TIF_SECCOMP 8 /* secure computing */ @@ -114,8 +115,9 @@ struct thread_info { #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) -#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) +#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP) +#define _TIF_RDS (1 << TIF_RDS) #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) #define _TIF_SECCOMP (1 << TIF_SECCOMP) @@ -147,7 +149,7 @@ struct thread_info {
/* flags to check in __switch_to() */ #define _TIF_WORK_CTXSW \ - (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP) + (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_RDS)
#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY) #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW) diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 47a3cc0..0f8303e 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -32,7 +32,7 @@ static void __init ssb_select_mitigation(void); * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any * writes to SPEC_CTRL contain whatever reserved bits have been set. */ -static u64 x86_spec_ctrl_base; +u64 x86_spec_ctrl_base;
/* * The vendor and possibly platform specific bits which can be modified in @@ -139,25 +139,41 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_set);
u64 x86_spec_ctrl_get_default(void) { - return x86_spec_ctrl_base; + u64 msrval = x86_spec_ctrl_base; + + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + msrval |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + return msrval; } EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) { + u64 host = x86_spec_ctrl_base; + if (!boot_cpu_has(X86_FEATURE_IBRS)) return; - if (x86_spec_ctrl_base != guest_spec_ctrl) + + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + host |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + + if (host != guest_spec_ctrl) wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl); } EXPORT_SYMBOL_GPL(x86_spec_ctrl_set_guest);
void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) { + u64 host = x86_spec_ctrl_base; + if (!boot_cpu_has(X86_FEATURE_IBRS)) return; - if (x86_spec_ctrl_base != guest_spec_ctrl) - wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); + + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + host |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + + if (host != guest_spec_ctrl) + wrmsrl(MSR_IA32_SPEC_CTRL, host); } EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index d112963..9689e92 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -31,6 +31,7 @@ #include <asm/tlbflush.h> #include <asm/mce.h> #include <asm/vm86.h> +#include <asm/spec-ctrl.h>
/* * per-CPU TSS segments. Threads are completely 'soft' on Linux, @@ -198,6 +199,24 @@ static inline void switch_to_bitmap(struct tss_struct *tss, } }
+static __always_inline void __speculative_store_bypass_update(unsigned long tifn) +{ + u64 msr; + + if (static_cpu_has(X86_FEATURE_AMD_RDS)) { + msr = x86_amd_ls_cfg_base | rds_tif_to_amd_ls_cfg(tifn); + wrmsrl(MSR_AMD64_LS_CFG, msr); + } else { + msr = x86_spec_ctrl_base | rds_tif_to_spec_ctrl(tifn); + wrmsrl(MSR_IA32_SPEC_CTRL, msr); + } +} + +void speculative_store_bypass_update(void) +{ + __speculative_store_bypass_update(current_thread_info()->flags); +} + void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, struct tss_struct *tss) { @@ -226,6 +245,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
if ((tifp ^ tifn) & _TIF_NOTSC) cr4_toggle_bits(X86_CR4_TSD); + + if ((tifp ^ tifn) & _TIF_RDS) + __speculative_store_bypass_update(tifn); }
/*
From: Thomas Gleixner tglx@linutronix.de
commit a73ec77ee17ec556fe7f165d00314cb7c047b1ac upstream
Add prctl-based control for the Speculative Store Bypass mitigation and make it the default mitigation for Intel and AMD.
Andi Kleen provided the following rationale (slightly redacted):
There are multiple levels of impact of Speculative Store Bypass:
1) JITed sandbox. It cannot invoke system calls, but can do PRIME+PROBE and may have call interfaces to other code
2) Native code process. No protection inside the process at this level.
3) Kernel.
4) Between processes.
The prctl tries to protect against attacks from case (1).
If the untrusted code can do random system calls then control is already lost in a much worse way. So there needs to be system call protection in some way (a JIT not allowing them, or seccomp). Or rather, if the process can somehow subvert its environment to do the prctl, it can already execute arbitrary code, which is much worse than SSB.
To put it differently, the point of the prctl is to not allow JITed code to read data it shouldn't read from its JITed sandbox. If it already has escaped its sandbox then it can already read everything it wants in its address space, and do much worse.
The ability to control Speculative Store Bypass allows enabling the protection selectively without affecting overall system performance.
Based on an initial patch from Tim Chen. Completely rewritten.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/kernel-parameters.txt | 6 ++ arch/x86/include/asm/nospec-branch.h | 1 arch/x86/kernel/cpu/bugs.c | 83 ++++++++++++++++++++++++++++++---- 3 files changed, 79 insertions(+), 11 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index dc138b8..80202de 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -3651,7 +3651,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. off - Unconditionally enable Speculative Store Bypass auto - Kernel detects whether the CPU model contains an implementation of Speculative Store Bypass and - picks the most appropriate mitigation + picks the most appropriate mitigation. + prctl - Control Speculative Store Bypass per thread + via prctl. Speculative Store Bypass is enabled + for a process by default. The state of the control + is inherited on fork.
Not specifying this option is equivalent to spec_store_bypass_disable=auto. diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 47c454c..155d955 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -187,6 +187,7 @@ extern u64 x86_spec_ctrl_get_default(void); enum ssb_mitigation { SPEC_STORE_BYPASS_NONE, SPEC_STORE_BYPASS_DISABLE, + SPEC_STORE_BYPASS_PRCTL, };
extern char __indirect_thunk_start[]; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 0f8303e..bcfccd3 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -11,6 +11,8 @@ #include <linux/utsname.h> #include <linux/cpu.h> #include <linux/module.h> +#include <linux/nospec.h> +#include <linux/prctl.h>
#include <asm/spec-ctrl.h> #include <asm/cmdline.h> @@ -411,20 +413,23 @@ enum ssb_mitigation_cmd { SPEC_STORE_BYPASS_CMD_NONE, SPEC_STORE_BYPASS_CMD_AUTO, SPEC_STORE_BYPASS_CMD_ON, + SPEC_STORE_BYPASS_CMD_PRCTL, };
static const char *ssb_strings[] = { [SPEC_STORE_BYPASS_NONE] = "Vulnerable", - [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled" + [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled", + [SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl" };
static const struct { const char *option; enum ssb_mitigation_cmd cmd; } ssb_mitigation_options[] = { - { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ - { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ - { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ + { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ + { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ + { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ + { "prctl", SPEC_STORE_BYPASS_CMD_PRCTL }, /* Disable Speculative Store Bypass via prctl */ };
static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void) @@ -474,14 +479,15 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void)
switch (cmd) { case SPEC_STORE_BYPASS_CMD_AUTO: - /* - * AMD platforms by default don't need SSB mitigation. - */ - if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) - break; + /* Choose prctl as the default mode */ + mode = SPEC_STORE_BYPASS_PRCTL; + break; case SPEC_STORE_BYPASS_CMD_ON: mode = SPEC_STORE_BYPASS_DISABLE; break; + case SPEC_STORE_BYPASS_CMD_PRCTL: + mode = SPEC_STORE_BYPASS_PRCTL; + break; case SPEC_STORE_BYPASS_CMD_NONE: break; } @@ -492,7 +498,7 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) * - X86_FEATURE_RDS - CPU is able to turn off speculative store bypass * - X86_FEATURE_SPEC_STORE_BYPASS_DISABLE - engage the mitigation */ - if (mode != SPEC_STORE_BYPASS_NONE) { + if (mode == SPEC_STORE_BYPASS_DISABLE) { setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE); /* * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses @@ -523,6 +529,63 @@ static void ssb_select_mitigation()
#undef pr_fmt
+static int ssb_prctl_set(unsigned long ctrl) +{ + bool rds = !!test_tsk_thread_flag(current, TIF_RDS); + + if (ssb_mode != SPEC_STORE_BYPASS_PRCTL) + return -ENXIO; + + if (ctrl == PR_SPEC_ENABLE) + clear_tsk_thread_flag(current, TIF_RDS); + else + set_tsk_thread_flag(current, TIF_RDS); + + if (rds != !!test_tsk_thread_flag(current, TIF_RDS)) + speculative_store_bypass_update(); + + return 0; +} + +static int ssb_prctl_get(void) +{ + switch (ssb_mode) { + case SPEC_STORE_BYPASS_DISABLE: + return PR_SPEC_DISABLE; + case SPEC_STORE_BYPASS_PRCTL: + if (test_tsk_thread_flag(current, TIF_RDS)) + return PR_SPEC_PRCTL | PR_SPEC_DISABLE; + return PR_SPEC_PRCTL | PR_SPEC_ENABLE; + default: + if (boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS)) + return PR_SPEC_ENABLE; + return PR_SPEC_NOT_AFFECTED; + } +} + +int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl) +{ + if (ctrl != PR_SPEC_ENABLE && ctrl != PR_SPEC_DISABLE) + return -ERANGE; + + switch (which) { + case PR_SPEC_STORE_BYPASS: + return ssb_prctl_set(ctrl); + default: + return -ENODEV; + } +} + +int arch_prctl_spec_ctrl_get(unsigned long which) +{ + switch (which) { + case PR_SPEC_STORE_BYPASS: + return ssb_prctl_get(); + default: + return -ENODEV; + } +} + void x86_spec_ctrl_setup_ap(void) { if (boot_cpu_has(X86_FEATURE_IBRS))
From: Kees Cook keescook@chromium.org
commit 7bbf1373e228840bb0295a2ca26d548ef37f448e upstream
Adjust arch_prctl_spec_ctrl_get()/arch_prctl_spec_ctrl_set() to operate on tasks other than current.
This is needed both for /proc/$pid/status queries and for seccomp (since thread-syncing can trigger seccomp in non-current threads).
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 27 ++++++++++++++++----------- include/linux/nospec.h | 7 +++++-- kernel/sys.c | 9 +++++---- 3 files changed, 26 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index bcfccd3..64b54a4 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -529,31 +529,35 @@ static void ssb_select_mitigation()
#undef pr_fmt
-static int ssb_prctl_set(unsigned long ctrl) +static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) { - bool rds = !!test_tsk_thread_flag(current, TIF_RDS); + bool rds = !!test_tsk_thread_flag(task, TIF_RDS);
if (ssb_mode != SPEC_STORE_BYPASS_PRCTL) return -ENXIO;
if (ctrl == PR_SPEC_ENABLE) - clear_tsk_thread_flag(current, TIF_RDS); + clear_tsk_thread_flag(task, TIF_RDS); else - set_tsk_thread_flag(current, TIF_RDS); + set_tsk_thread_flag(task, TIF_RDS);
- if (rds != !!test_tsk_thread_flag(current, TIF_RDS)) + /* + * If being set on non-current task, delay setting the CPU + * mitigation until it is next scheduled. + */ + if (task == current && rds != !!test_tsk_thread_flag(task, TIF_RDS)) speculative_store_bypass_update();
return 0; }
-static int ssb_prctl_get(void) +static int ssb_prctl_get(struct task_struct *task) { switch (ssb_mode) { case SPEC_STORE_BYPASS_DISABLE: return PR_SPEC_DISABLE; case SPEC_STORE_BYPASS_PRCTL: - if (test_tsk_thread_flag(current, TIF_RDS)) + if (test_tsk_thread_flag(task, TIF_RDS)) return PR_SPEC_PRCTL | PR_SPEC_DISABLE; return PR_SPEC_PRCTL | PR_SPEC_ENABLE; default: @@ -563,24 +567,25 @@ static int ssb_prctl_get(void) } }
-int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl) +int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, + unsigned long ctrl) { if (ctrl != PR_SPEC_ENABLE && ctrl != PR_SPEC_DISABLE) return -ERANGE;
switch (which) { case PR_SPEC_STORE_BYPASS: - return ssb_prctl_set(ctrl); + return ssb_prctl_set(task, ctrl); default: return -ENODEV; } }
-int arch_prctl_spec_ctrl_get(unsigned long which) +int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) { switch (which) { case PR_SPEC_STORE_BYPASS: - return ssb_prctl_get(); + return ssb_prctl_get(task); default: return -ENODEV; } diff --git a/include/linux/nospec.h b/include/linux/nospec.h index 700bb8a..a908c95 100644 --- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -7,6 +7,8 @@ #define _LINUX_NOSPEC_H #include <asm/barrier.h>
+struct task_struct; + /** * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise * @index: array element index @@ -57,7 +59,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, })
/* Speculation control prctl */ -int arch_prctl_spec_ctrl_get(unsigned long which); -int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl); +int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which); +int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, + unsigned long ctrl);
#endif /* _LINUX_NOSPEC_H */ diff --git a/kernel/sys.c b/kernel/sys.c index d80c33f..f718742 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2075,12 +2075,13 @@ static int prctl_get_tid_address(struct task_struct *me, int __user **tid_addr) } #endif
-int __weak arch_prctl_spec_ctrl_get(unsigned long which) +int __weak arch_prctl_spec_ctrl_get(struct task_struct *t, unsigned long which) { return -EINVAL; }
-int __weak arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl) +int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which, + unsigned long ctrl) { return -EINVAL; } @@ -2282,12 +2283,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, case PR_GET_SPECULATION_CTRL: if (arg3 || arg4 || arg5) return -EINVAL; - error = arch_prctl_spec_ctrl_get(arg2); + error = arch_prctl_spec_ctrl_get(me, arg2); break; case PR_SET_SPECULATION_CTRL: if (arg4 || arg5) return -EINVAL; - error = arch_prctl_spec_ctrl_set(arg2, arg3); + error = arch_prctl_spec_ctrl_set(me, arg2, arg3); break; default: error = -EINVAL;
From: Kees Cook keescook@chromium.org
commit fae1fa0fc6cca8beee3ab8ed71d54f9a78fa3f64 upstream
As done with seccomp and no_new_privs, also show speculation flaw mitigation state in /proc/$pid/status.
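A quick way to observe the new field from userspace (hedged sketch; the field name matches the seq_printf() in the diff below):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[256];
        FILE *f = fopen("/proc/self/status", "r");

        if (!f)
            return 1;
        while (fgets(line, sizeof(line), f))
            if (!strncmp(line, "Speculation Store Bypass:", 25))
                fputs(line, stdout);    /* e.g. "thread vulnerable" */
        fclose(f);
        return 0;
    }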
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
fs/proc/array.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/fs/proc/array.c b/fs/proc/array.c index b6c00ce..bb48358 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -79,6 +79,7 @@ #include <linux/delayacct.h> #include <linux/seq_file.h> #include <linux/pid_namespace.h> +#include <linux/prctl.h> #include <linux/ptrace.h> #include <linux/tracehook.h> #include <linux/string_helpers.h> @@ -332,6 +333,28 @@ static inline void task_seccomp(struct seq_file *m, struct task_struct *p) #ifdef CONFIG_SECCOMP seq_printf(m, "Seccomp:\t%d\n", p->seccomp.mode); #endif + seq_printf(m, "\nSpeculation Store Bypass:\t"); + switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) { + case -EINVAL: + seq_printf(m, "unknown"); + break; + case PR_SPEC_NOT_AFFECTED: + seq_printf(m, "not vulnerable"); + break; + case PR_SPEC_PRCTL | PR_SPEC_DISABLE: + seq_printf(m, "thread mitigated"); + break; + case PR_SPEC_PRCTL | PR_SPEC_ENABLE: + seq_printf(m, "thread vulnerable"); + break; + case PR_SPEC_DISABLE: + seq_printf(m, "globally mitigated"); + break; + default: + seq_printf(m, "vulnerable"); + break; + } + seq_putc(m, '\n'); }
static inline void task_context_switch_counts(struct seq_file *m,
From: Kees Cook keescook@chromium.org
commit 5c3070890d06ff82eecb808d02d2ca39169533ef upstream
When speculation flaw mitigations are opt-in (via prctl), using seccomp will automatically opt in to these protections, since using seccomp indicates at least some level of sandboxing is desired.
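The expected interaction, as a hedged userspace sketch (assuming the kernel runs with ssb_mode == SPEC_STORE_BYPASS_PRCTL): entering seccomp should make the per-task state read back as mitigated:

    #include <stdio.h>
    #include <sys/prctl.h>
    #include <linux/prctl.h>
    #include <linux/seccomp.h>
    #include <linux/filter.h>

    int main(void)
    {
        /* Minimal allow-everything filter, just to enter seccomp mode. */
        struct sock_filter insns[] = {
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = { .len = 1, .filter = insns };

        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);

        /* Expect PR_SPEC_PRCTL | PR_SPEC_DISABLE after the filter lands. */
        printf("ssb state: 0x%x\n",
               prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0));
        return 0;
    }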
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
kernel/seccomp.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c index efd384f..bfb1ee8 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -16,6 +16,8 @@ #include <linux/atomic.h> #include <linux/audit.h> #include <linux/compat.h> +#include <linux/nospec.h> +#include <linux/prctl.h> #include <linux/sched.h> #include <linux/seccomp.h> #include <linux/slab.h> @@ -214,6 +216,19 @@ static inline bool seccomp_may_assign_mode(unsigned long seccomp_mode) return true; }
+/* + * If a given speculation mitigation is opt-in (prctl()-controlled), + * select it, by disabling speculation (enabling mitigation). + */ +static inline void spec_mitigate(struct task_struct *task, + unsigned long which) +{ + int state = arch_prctl_spec_ctrl_get(task, which); + + if (state > 0 && (state & PR_SPEC_PRCTL)) + arch_prctl_spec_ctrl_set(task, which, PR_SPEC_DISABLE); +} + static inline void seccomp_assign_mode(struct task_struct *task, unsigned long seccomp_mode) { @@ -225,6 +240,8 @@ static inline void seccomp_assign_mode(struct task_struct *task, * filter) is set. */ smp_mb__before_atomic(); + /* Assume seccomp processes want speculation flaw mitigation. */ + spec_mitigate(task, PR_SPEC_STORE_BYPASS); set_tsk_thread_flag(task, TIF_SECCOMP); }
From: Thomas Gleixner tglx@linutronix.de
commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee upstream
For certain use cases it is desirable to enforce mitigations so that they cannot be undone afterwards. That's important for loader stubs which want to prevent a child from disabling the mitigation again. It will also be used for seccomp(). The extra preservation of the prctl state for SSB is a preparatory step for eBPF dynamic speculation control.
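A minimal user-space sketch of the new semantics (illustration only, not part of the patch). The fallback defines mirror the uapi additions below; ENXIO on the first call would simply mean the mitigation is not in prctl mode on that system:

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>
#include <linux/prctl.h>

#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL	53
#define PR_SPEC_STORE_BYPASS	0
#define PR_SPEC_ENABLE		(1UL << 1)
#endif
#ifndef PR_SPEC_FORCE_DISABLE
#define PR_SPEC_FORCE_DISABLE	(1UL << 3)
#endif

int main(void)
{
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
		  PR_SPEC_FORCE_DISABLE, 0, 0)) {
		perror("PR_SPEC_FORCE_DISABLE");
		return 1;
	}

	/* Once force-disabled, re-enabling must fail with EPERM. */
	errno = 0;
	prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
	      PR_SPEC_ENABLE, 0, 0);
	printf("re-enable attempt: %s\n", strerror(errno));
	return 0;
}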
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/spec_ctrl.txt | 34 +++++++++++++++++++++------------- arch/x86/kernel/cpu/bugs.c | 35 +++++++++++++++++++++++++---------- fs/proc/array.c | 3 +++ include/linux/sched.h | 9 +++++++++ include/uapi/linux/prctl.h | 1 + 5 files changed, 59 insertions(+), 23 deletions(-)
diff --git a/Documentation/spec_ctrl.txt b/Documentation/spec_ctrl.txt index ddbebcd..1b3690d 100644 --- a/Documentation/spec_ctrl.txt +++ b/Documentation/spec_ctrl.txt @@ -25,19 +25,21 @@ PR_GET_SPECULATION_CTRL -----------------------
PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature -which is selected with arg2 of prctl(2). The return value uses bits 0-2 with +which is selected with arg2 of prctl(2). The return value uses bits 0-3 with the following meaning:
-==== ================ =================================================== -Bit Define Description -==== ================ =================================================== -0 PR_SPEC_PRCTL Mitigation can be controlled per task by - PR_SET_SPECULATION_CTRL -1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is - disabled -2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is - enabled -==== ================ =================================================== +==== ===================== =================================================== +Bit Define Description +==== ===================== =================================================== +0 PR_SPEC_PRCTL Mitigation can be controlled per task by + PR_SET_SPECULATION_CTRL +1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is + disabled +2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is + enabled +3 PR_SPEC_FORCE_DISABLE Same as PR_SPEC_DISABLE, but cannot be undone. A + subsequent prctl(..., PR_SPEC_ENABLE) will fail. +==== ===================== ===================================================
If all bits are 0 the CPU is not affected by the speculation misfeature.
@@ -47,9 +49,11 @@ misfeature will fail.
PR_SET_SPECULATION_CTRL ----------------------- + PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which is selected by arg2 of :manpage:`prctl(2)` per task. arg3 is used to hand -in the control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE. +in the control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE or +PR_SPEC_FORCE_DISABLE.
Common error codes ------------------ @@ -70,10 +74,13 @@ Value Meaning 0 Success
ERANGE arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor - PR_SPEC_DISABLE + PR_SPEC_DISABLE nor PR_SPEC_FORCE_DISABLE
ENXIO Control of the selected speculation misfeature is not possible. See PR_GET_SPECULATION_CTRL. + +EPERM Speculation was disabled with PR_SPEC_FORCE_DISABLE and caller + tried to enable it again. ======= =================================================================
Speculation misfeature controls @@ -84,3 +91,4 @@ Speculation misfeature controls * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0); * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0); * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0); + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0); diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 64b54a4..d6897ca 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -531,21 +531,37 @@ static void ssb_select_mitigation()
static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) { - bool rds = !!test_tsk_thread_flag(task, TIF_RDS); + bool update;
if (ssb_mode != SPEC_STORE_BYPASS_PRCTL) return -ENXIO;
- if (ctrl == PR_SPEC_ENABLE) - clear_tsk_thread_flag(task, TIF_RDS); - else - set_tsk_thread_flag(task, TIF_RDS); + switch (ctrl) { + case PR_SPEC_ENABLE: + /* If speculation is force disabled, enable is not allowed */ + if (task_spec_ssb_force_disable(task)) + return -EPERM; + task_clear_spec_ssb_disable(task); + update = test_and_clear_tsk_thread_flag(task, TIF_RDS); + break; + case PR_SPEC_DISABLE: + task_set_spec_ssb_disable(task); + update = !test_and_set_tsk_thread_flag(task, TIF_RDS); + break; + case PR_SPEC_FORCE_DISABLE: + task_set_spec_ssb_disable(task); + task_set_spec_ssb_force_disable(task); + update = !test_and_set_tsk_thread_flag(task, TIF_RDS); + break; + default: + return -ERANGE; + }
/* * If being set on non-current task, delay setting the CPU * mitigation until it is next scheduled. */ - if (task == current && rds != !!test_tsk_thread_flag(task, TIF_RDS)) + if (task == current && update) speculative_store_bypass_update();
return 0; @@ -557,7 +573,9 @@ static int ssb_prctl_get(struct task_struct *task) case SPEC_STORE_BYPASS_DISABLE: return PR_SPEC_DISABLE; case SPEC_STORE_BYPASS_PRCTL: - if (test_tsk_thread_flag(task, TIF_RDS)) + if (task_spec_ssb_force_disable(task)) + return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE; + if (task_spec_ssb_disable(task)) return PR_SPEC_PRCTL | PR_SPEC_DISABLE; return PR_SPEC_PRCTL | PR_SPEC_ENABLE; default: @@ -570,9 +588,6 @@ static int ssb_prctl_get(struct task_struct *task) int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, unsigned long ctrl) { - if (ctrl != PR_SPEC_ENABLE && ctrl != PR_SPEC_DISABLE) - return -ERANGE; - switch (which) { case PR_SPEC_STORE_BYPASS: return ssb_prctl_set(task, ctrl); diff --git a/fs/proc/array.c b/fs/proc/array.c index bb48358..3141478 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -341,6 +341,9 @@ static inline void task_seccomp(struct seq_file *m, struct task_struct *p) case PR_SPEC_NOT_AFFECTED: seq_printf(m, "not vulnerable"); break; + case PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE: + seq_printf(m, "thread force mitigated"); + break; case PR_SPEC_PRCTL | PR_SPEC_DISABLE: seq_printf(m, "thread mitigated"); break; diff --git a/include/linux/sched.h b/include/linux/sched.h index 90bea39..725498c 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2167,6 +2167,8 @@ static inline void memalloc_noio_restore(unsigned int flags) #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ #define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */ #define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ +#define PFA_SPEC_SSB_DISABLE 4 /* Speculative Store Bypass disabled */ +#define PFA_SPEC_SSB_FORCE_DISABLE 5 /* Speculative Store Bypass force disabled*/
#define TASK_PFA_TEST(name, func) \ @@ -2190,6 +2192,13 @@ TASK_PFA_TEST(SPREAD_SLAB, spread_slab) TASK_PFA_SET(SPREAD_SLAB, spread_slab) TASK_PFA_CLEAR(SPREAD_SLAB, spread_slab)
+TASK_PFA_TEST(SPEC_SSB_DISABLE, spec_ssb_disable) +TASK_PFA_SET(SPEC_SSB_DISABLE, spec_ssb_disable) +TASK_PFA_CLEAR(SPEC_SSB_DISABLE, spec_ssb_disable) + +TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) +TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) + /* * task->jobctl flags */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 3b316be..64776b7 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -207,5 +207,6 @@ struct prctl_mm_map { # define PR_SPEC_PRCTL (1UL << 0) # define PR_SPEC_ENABLE (1UL << 1) # define PR_SPEC_DISABLE (1UL << 2) +# define PR_SPEC_FORCE_DISABLE (1UL << 3)
#endif /* _LINUX_PRCTL_H */
From: Thomas Gleixner tglx@linutronix.de
commit b849a812f7eb92e96d1c8239b06581b2cfd8b275 upstream
Use PR_SPEC_FORCE_DISABLE in seccomp() because seccomp does not allow restrictions to be widened afterwards.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
kernel/seccomp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c index bfb1ee8..f33539f 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -226,7 +226,7 @@ static inline void spec_mitigate(struct task_struct *task, int state = arch_prctl_spec_ctrl_get(task, which);
if (state > 0 && (state & PR_SPEC_PRCTL)) - arch_prctl_spec_ctrl_set(task, which, PR_SPEC_DISABLE); + arch_prctl_spec_ctrl_set(task, which, PR_SPEC_FORCE_DISABLE); }
static inline void seccomp_assign_mode(struct task_struct *task,
From: Kees Cook keescook@chromium.org
commit 00a02d0c502a06d15e07b857f8ff921e3e402675 upstream
If a seccomp user is not interested in Speculative Store Bypass mitigation by default, it can set the new SECCOMP_FILTER_FLAG_SPEC_ALLOW flag when adding filters.
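A user-space sketch of the new flag (illustration only, not from the patch). It uses the raw seccomp(2) syscall, since older libcs provide no wrapper, and the fallback define mirrors the uapi value added below:

#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

#ifndef SECCOMP_FILTER_FLAG_SPEC_ALLOW
#define SECCOMP_FILTER_FLAG_SPEC_ALLOW	(1UL << 2)
#endif

int main(void)
{
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return 1;

	/* Opt out of the implicit SSB mitigation while installing the filter. */
	if (syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
		    SECCOMP_FILTER_FLAG_SPEC_ALLOW, &prog)) {
		perror("seccomp");
		return 1;
	}
	puts("filter installed without forcing the SSB mitigation");
	return 0;
}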
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
include/linux/seccomp.h | 3 + include/uapi/linux/seccomp.h | 4 + kernel/seccomp.c | 19 ++++-- tools/testing/selftests/seccomp/seccomp_bpf.c | 78 +++++++++++++++++++++++++ 4 files changed, 93 insertions(+), 11 deletions(-)
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h index 2296e6b..5a53d34 100644 --- a/include/linux/seccomp.h +++ b/include/linux/seccomp.h @@ -3,7 +3,8 @@
#include <uapi/linux/seccomp.h>
-#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC) +#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC | \ + SECCOMP_FILTER_FLAG_SPEC_ALLOW)
#ifdef CONFIG_SECCOMP
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h index 0f238a4..e4acb61 100644 --- a/include/uapi/linux/seccomp.h +++ b/include/uapi/linux/seccomp.h @@ -15,7 +15,9 @@ #define SECCOMP_SET_MODE_FILTER 1
/* Valid flags for SECCOMP_SET_MODE_FILTER */ -#define SECCOMP_FILTER_FLAG_TSYNC 1 +#define SECCOMP_FILTER_FLAG_TSYNC (1UL << 0) +/* In v4.14+ SECCOMP_FILTER_FLAG_LOG is (1UL << 1) */ +#define SECCOMP_FILTER_FLAG_SPEC_ALLOW (1UL << 2)
/* * All BPF programs must return a 32-bit value. diff --git a/kernel/seccomp.c b/kernel/seccomp.c index f33539f..4bb8a5a 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -230,7 +230,8 @@ static inline void spec_mitigate(struct task_struct *task, }
static inline void seccomp_assign_mode(struct task_struct *task, - unsigned long seccomp_mode) + unsigned long seccomp_mode, + unsigned long flags) { assert_spin_locked(&task->sighand->siglock);
@@ -240,8 +241,9 @@ static inline void seccomp_assign_mode(struct task_struct *task, * filter) is set. */ smp_mb__before_atomic(); - /* Assume seccomp processes want speculation flaw mitigation. */ - spec_mitigate(task, PR_SPEC_STORE_BYPASS); + /* Assume default seccomp processes want spec flaw mitigation. */ + if ((flags & SECCOMP_FILTER_FLAG_SPEC_ALLOW) == 0) + spec_mitigate(task, PR_SPEC_STORE_BYPASS); set_tsk_thread_flag(task, TIF_SECCOMP); }
@@ -309,7 +311,7 @@ static inline pid_t seccomp_can_sync_threads(void) * without dropping the locks. * */ -static inline void seccomp_sync_threads(void) +static inline void seccomp_sync_threads(unsigned long flags) { struct task_struct *thread, *caller;
@@ -350,7 +352,8 @@ static inline void seccomp_sync_threads(void) * allow one thread to transition the other. */ if (thread->seccomp.mode == SECCOMP_MODE_DISABLED) - seccomp_assign_mode(thread, SECCOMP_MODE_FILTER); + seccomp_assign_mode(thread, SECCOMP_MODE_FILTER, + flags); } }
@@ -469,7 +472,7 @@ static long seccomp_attach_filter(unsigned int flags,
/* Now that the new filter is in place, synchronize to all threads. */ if (flags & SECCOMP_FILTER_FLAG_TSYNC) - seccomp_sync_threads(); + seccomp_sync_threads(flags);
return 0; } @@ -764,7 +767,7 @@ static long seccomp_set_mode_strict(void) #ifdef TIF_NOTSC disable_TSC(); #endif - seccomp_assign_mode(current, seccomp_mode); + seccomp_assign_mode(current, seccomp_mode, 0); ret = 0;
out: @@ -822,7 +825,7 @@ static long seccomp_set_mode_filter(unsigned int flags, /* Do not free the successfully attached filter. */ prepared = NULL;
- seccomp_assign_mode(current, seccomp_mode); + seccomp_assign_mode(current, seccomp_mode, flags); out: spin_unlock_irq(¤t->sighand->siglock); if (flags & SECCOMP_FILTER_FLAG_TSYNC) diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 29487e0..b3f3454 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -1477,7 +1477,11 @@ TEST_F(TRACE_syscall, syscall_dropped) #endif
#ifndef SECCOMP_FILTER_FLAG_TSYNC -#define SECCOMP_FILTER_FLAG_TSYNC 1 +#define SECCOMP_FILTER_FLAG_TSYNC (1UL << 0) +#endif + +#ifndef SECCOMP_FILTER_FLAG_SPEC_ALLOW +#define SECCOMP_FILTER_FLAG_SPEC_ALLOW (1UL << 2) #endif
#ifndef seccomp @@ -1576,6 +1580,78 @@ TEST(seccomp_syscall_mode_lock) } }
+/* + * Test detection of known and unknown filter flags. Userspace needs to be able + * to check if a filter flag is supported by the current kernel and a good way + * of doing that is by attempting to enter filter mode, with the flag bit in + * question set, and a NULL pointer for the _args_ parameter. EFAULT indicates + * that the flag is valid and EINVAL indicates that the flag is invalid. + */ +TEST(detect_seccomp_filter_flags) +{ + unsigned int flags[] = { SECCOMP_FILTER_FLAG_TSYNC, + SECCOMP_FILTER_FLAG_SPEC_ALLOW }; + unsigned int flag, all_flags; + int i; + long ret; + + /* Test detection of known-good filter flags */ + for (i = 0, all_flags = 0; i < ARRAY_SIZE(flags); i++) { + int bits = 0; + + flag = flags[i]; + /* Make sure the flag is a single bit! */ + while (flag) { + if (flag & 0x1) + bits ++; + flag >>= 1; + } + ASSERT_EQ(1, bits); + flag = flags[i]; + + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL); + ASSERT_NE(ENOSYS, errno) { + TH_LOG("Kernel does not support seccomp syscall!"); + } + EXPECT_EQ(-1, ret); + EXPECT_EQ(EFAULT, errno) { + TH_LOG("Failed to detect that a known-good filter flag (0x%X) is supported!", + flag); + } + + all_flags |= flag; + } + + /* Test detection of all known-good filter flags */ + ret = seccomp(SECCOMP_SET_MODE_FILTER, all_flags, NULL); + EXPECT_EQ(-1, ret); + EXPECT_EQ(EFAULT, errno) { + TH_LOG("Failed to detect that all known-good filter flags (0x%X) are supported!", + all_flags); + } + + /* Test detection of an unknown filter flag */ + flag = -1; + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL); + EXPECT_EQ(-1, ret); + EXPECT_EQ(EINVAL, errno) { + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported!", + flag); + } + + /* + * Test detection of an unknown filter flag that may simply need to be + * added to this test + */ + flag = flags[ARRAY_SIZE(flags) - 1] << 1; + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL); + EXPECT_EQ(-1, ret); + EXPECT_EQ(EINVAL, errno) { + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported! Does a new flag need to be added to this test?", + flag); + } +} + TEST(TSYNC_first) { struct sock_filter filter[] = {
From: Thomas Gleixner tglx@linutronix.de
commit 8bf37d8c067bb7eb8e7c381bdadf9bd89182b6bc upstream
The mitigation control is simpler to implement in architecture code, as it avoids the extra function call to check the mode. Aside from that, having an explicit seccomp-enabled mode in the architecture mitigations would require even more workarounds.
Move it into architecture code and provide a weak function in the seccomp code. Remove the 'which' argument as this allows the architecture to decide which mitigations are relevant for seccomp.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 29 ++++++++++++++++++----------- include/linux/nospec.h | 2 ++ kernel/seccomp.c | 15 ++------------- 3 files changed, 22 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index d6897ca..b005ef7 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -567,6 +567,24 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) return 0; }
+int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, + unsigned long ctrl) +{ + switch (which) { + case PR_SPEC_STORE_BYPASS: + return ssb_prctl_set(task, ctrl); + default: + return -ENODEV; + } +} + +#ifdef CONFIG_SECCOMP +void arch_seccomp_spec_mitigate(struct task_struct *task) +{ + ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE); +} +#endif + static int ssb_prctl_get(struct task_struct *task) { switch (ssb_mode) { @@ -585,17 +603,6 @@ static int ssb_prctl_get(struct task_struct *task) } }
-int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, - unsigned long ctrl) -{ - switch (which) { - case PR_SPEC_STORE_BYPASS: - return ssb_prctl_set(task, ctrl); - default: - return -ENODEV; - } -} - int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) { switch (which) { diff --git a/include/linux/nospec.h b/include/linux/nospec.h index a908c95..0c5ef54 100644 --- a/include/linux/nospec.h +++ b/include/linux/nospec.h @@ -62,5 +62,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which); int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, unsigned long ctrl); +/* Speculation control for seccomp enforced mitigation */ +void arch_seccomp_spec_mitigate(struct task_struct *task);
#endif /* _LINUX_NOSPEC_H */ diff --git a/kernel/seccomp.c b/kernel/seccomp.c index 4bb8a5a..9a9203b 100644 --- a/kernel/seccomp.c +++ b/kernel/seccomp.c @@ -216,18 +216,7 @@ static inline bool seccomp_may_assign_mode(unsigned long seccomp_mode) return true; }
-/* - * If a given speculation mitigation is opt-in (prctl()-controlled), - * select it, by disabling speculation (enabling mitigation). - */ -static inline void spec_mitigate(struct task_struct *task, - unsigned long which) -{ - int state = arch_prctl_spec_ctrl_get(task, which); - - if (state > 0 && (state & PR_SPEC_PRCTL)) - arch_prctl_spec_ctrl_set(task, which, PR_SPEC_FORCE_DISABLE); -} +void __weak arch_seccomp_spec_mitigate(struct task_struct *task) { }
static inline void seccomp_assign_mode(struct task_struct *task, unsigned long seccomp_mode, @@ -243,7 +232,7 @@ static inline void seccomp_assign_mode(struct task_struct *task, smp_mb__before_atomic(); /* Assume default seccomp processes want spec flaw mitigation. */ if ((flags & SECCOMP_FILTER_FLAG_SPEC_ALLOW) == 0) - spec_mitigate(task, PR_SPEC_STORE_BYPASS); + arch_seccomp_spec_mitigate(task); set_tsk_thread_flag(task, TIF_SECCOMP); }
From: Kees Cook keescook@chromium.org
commit f21b53b20c754021935ea43364dbf53778eeba32 upstream
Unless explicitly opted out of, anything running under seccomp will have SSB mitigations enabled. Choosing the "prctl" mode will disable this.
[ tglx: Adjusted it to the new arch_seccomp_spec_mitigate() mechanism ]
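To see which mode was picked at boot, user space can read the sysfs vulnerability file populated from ssb_strings[]. A rough sketch (the sysfs path comes from earlier patches in this series):

#include <stdio.h>

int main(void)
{
	char buf[128];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/spec_store_bypass", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* e.g. "Mitigation: Speculative Store Bypass disabled via prctl and seccomp" */
	if (fgets(buf, sizeof(buf), f))
		fputs(buf, stdout);
	fclose(f);
	return 0;
}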
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/kernel-parameters.txt | 26 +++++++++++++++++--------- arch/x86/include/asm/nospec-branch.h | 1 + arch/x86/kernel/cpu/bugs.c | 32 +++++++++++++++++++++++--------- 3 files changed, 41 insertions(+), 18 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 80202de..3fd53e1 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -3647,19 +3647,27 @@ bytes respectively. Such letter suffixes can also be entirely omitted. This parameter controls whether the Speculative Store Bypass optimization is used.
- on - Unconditionally disable Speculative Store Bypass - off - Unconditionally enable Speculative Store Bypass - auto - Kernel detects whether the CPU model contains an - implementation of Speculative Store Bypass and - picks the most appropriate mitigation. - prctl - Control Speculative Store Bypass per thread - via prctl. Speculative Store Bypass is enabled - for a process by default. The state of the control - is inherited on fork. + on - Unconditionally disable Speculative Store Bypass + off - Unconditionally enable Speculative Store Bypass + auto - Kernel detects whether the CPU model contains an + implementation of Speculative Store Bypass and + picks the most appropriate mitigation. If the + CPU is not vulnerable, "off" is selected. If the + CPU is vulnerable the default mitigation is + architecture and Kconfig dependent. See below. + prctl - Control Speculative Store Bypass per thread + via prctl. Speculative Store Bypass is enabled + for a process by default. The state of the control + is inherited on fork. + seccomp - Same as "prctl" above, but all seccomp threads + will disable SSB unless they explicitly opt out.
Not specifying this option is equivalent to spec_store_bypass_disable=auto.
+ Default mitigations: + X86: If CONFIG_SECCOMP=y "seccomp", otherwise "prctl" + spia_io_base= [HW,MTD] spia_fio_base= spia_pedr= diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 155d955..930c1594 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -188,6 +188,7 @@ enum ssb_mitigation { SPEC_STORE_BYPASS_NONE, SPEC_STORE_BYPASS_DISABLE, SPEC_STORE_BYPASS_PRCTL, + SPEC_STORE_BYPASS_SECCOMP, };
extern char __indirect_thunk_start[]; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index b005ef7..6fd3fcf 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -414,22 +414,25 @@ enum ssb_mitigation_cmd { SPEC_STORE_BYPASS_CMD_AUTO, SPEC_STORE_BYPASS_CMD_ON, SPEC_STORE_BYPASS_CMD_PRCTL, + SPEC_STORE_BYPASS_CMD_SECCOMP, };
static const char *ssb_strings[] = { [SPEC_STORE_BYPASS_NONE] = "Vulnerable", [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled", - [SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl" + [SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl", + [SPEC_STORE_BYPASS_SECCOMP] = "Mitigation: Speculative Store Bypass disabled via prctl and seccomp", };
static const struct { const char *option; enum ssb_mitigation_cmd cmd; } ssb_mitigation_options[] = { - { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ - { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ - { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ - { "prctl", SPEC_STORE_BYPASS_CMD_PRCTL }, /* Disable Speculative Store Bypass via prctl */ + { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ + { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ + { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ + { "prctl", SPEC_STORE_BYPASS_CMD_PRCTL }, /* Disable Speculative Store Bypass via prctl */ + { "seccomp", SPEC_STORE_BYPASS_CMD_SECCOMP }, /* Disable Speculative Store Bypass via prctl and seccomp */ };
static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void) @@ -479,8 +482,15 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void)
switch (cmd) { case SPEC_STORE_BYPASS_CMD_AUTO: - /* Choose prctl as the default mode */ - mode = SPEC_STORE_BYPASS_PRCTL; + case SPEC_STORE_BYPASS_CMD_SECCOMP: + /* + * Choose prctl+seccomp as the default mode if seccomp is + * enabled. + */ + if (IS_ENABLED(CONFIG_SECCOMP)) + mode = SPEC_STORE_BYPASS_SECCOMP; + else + mode = SPEC_STORE_BYPASS_PRCTL; break; case SPEC_STORE_BYPASS_CMD_ON: mode = SPEC_STORE_BYPASS_DISABLE; @@ -528,12 +538,14 @@ static void ssb_select_mitigation() }
#undef pr_fmt +#define pr_fmt(fmt) "Speculation prctl: " fmt
static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) { bool update;
- if (ssb_mode != SPEC_STORE_BYPASS_PRCTL) + if (ssb_mode != SPEC_STORE_BYPASS_PRCTL && + ssb_mode != SPEC_STORE_BYPASS_SECCOMP) return -ENXIO;
switch (ctrl) { @@ -581,7 +593,8 @@ int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, #ifdef CONFIG_SECCOMP void arch_seccomp_spec_mitigate(struct task_struct *task) { - ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE); + if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP) + ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE); } #endif
@@ -590,6 +603,7 @@ static int ssb_prctl_get(struct task_struct *task) switch (ssb_mode) { case SPEC_STORE_BYPASS_DISABLE: return PR_SPEC_DISABLE; + case SPEC_STORE_BYPASS_SECCOMP: case SPEC_STORE_BYPASS_PRCTL: if (task_spec_ssb_force_disable(task)) return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 9f65fb29374ee37856dbad847b4e121aab72b510 upstream
Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTRL[2] as SSBD (Speculative Store Bypass Disable).
Hence changing it.
It is as yet unclear what the MSR_IA32_ARCH_CAPABILITIES (0x10a) Bit(4) name is going to be. Following the rename it would be SSBD_NO, but that expands to "Speculative Store Bypass Disable No".
Also fixed the missing space in X86_FEATURE_AMD_SSBD.
[ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ]
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Srivatsa: Backported to 4.4.y, skipping the KVM changes in this patch. ] Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 4 ++-- arch/x86/include/asm/msr-index.h | 10 +++++----- arch/x86/include/asm/spec-ctrl.h | 12 ++++++------ arch/x86/include/asm/thread_info.h | 6 +++--- arch/x86/kernel/cpu/amd.c | 14 +++++++------- arch/x86/kernel/cpu/bugs.c | 36 ++++++++++++++++++------------------ arch/x86/kernel/cpu/common.c | 2 +- arch/x86/kernel/cpu/intel.c | 2 +- arch/x86/kernel/process.c | 8 ++++---- 9 files changed, 47 insertions(+), 47 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index b7cdd1c..97926be 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -204,7 +204,7 @@ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ -#define X86_FEATURE_AMD_RDS (7*32+24) /* "" AMD RDS implementation */ +#define X86_FEATURE_AMD_SSBD (7*32+24) /* "" AMD SSBD implementation */
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ @@ -299,7 +299,7 @@ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */ -#define X86_FEATURE_RDS (18*32+31) /* Reduced Data Speculation */ +#define X86_FEATURE_SSBD (18*32+31) /* Speculative Store Bypass Disable */
/* * BUG word(s) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 883cf0d..2ea2ff1 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -35,8 +35,8 @@ #define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ #define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ #define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ -#define SPEC_CTRL_RDS_SHIFT 2 /* Reduced Data Speculation bit */ -#define SPEC_CTRL_RDS (1 << SPEC_CTRL_RDS_SHIFT) /* Reduced Data Speculation */ +#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */ +#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ #define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */ @@ -58,10 +58,10 @@ #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a #define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */ #define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */ -#define ARCH_CAP_RDS_NO (1 << 4) /* +#define ARCH_CAP_SSBD_NO (1 << 4) /* * Not susceptible to Speculative Store Bypass - * attack, so no Reduced Data Speculation control - * required. + * attack, so no Speculative Store Bypass + * control required. */
#define MSR_IA32_BBL_CR_CTL 0x00000119 diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 45ef00a..dc21209 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -17,20 +17,20 @@ extern void x86_spec_ctrl_restore_host(u64);
/* AMD specific Speculative Store Bypass MSR data */ extern u64 x86_amd_ls_cfg_base; -extern u64 x86_amd_ls_cfg_rds_mask; +extern u64 x86_amd_ls_cfg_ssbd_mask;
/* The Intel SPEC CTRL MSR base value cache */ extern u64 x86_spec_ctrl_base;
-static inline u64 rds_tif_to_spec_ctrl(u64 tifn) +static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn) { - BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT); - return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT); + BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT); + return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); }
-static inline u64 rds_tif_to_amd_ls_cfg(u64 tifn) +static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn) { - return (tifn & _TIF_RDS) ? x86_amd_ls_cfg_rds_mask : 0ULL; + return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL; }
extern void speculative_store_bypass_update(void); diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index 36a49b4..a96e88b 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -92,7 +92,7 @@ struct thread_info { #define TIF_SIGPENDING 2 /* signal pending */ #define TIF_NEED_RESCHED 3 /* rescheduling necessary */ #define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/ -#define TIF_RDS 5 /* Reduced data speculation */ +#define TIF_SSBD 5 /* Reduced data speculation */ #define TIF_SYSCALL_EMU 6 /* syscall emulation active */ #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ #define TIF_SECCOMP 8 /* secure computing */ @@ -117,7 +117,7 @@ struct thread_info { #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) #define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP) -#define _TIF_RDS (1 << TIF_RDS) +#define _TIF_SSBD (1 << TIF_SSBD) #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU) #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) #define _TIF_SECCOMP (1 << TIF_SECCOMP) @@ -149,7 +149,7 @@ struct thread_info {
/* flags to check in __switch_to() */ #define _TIF_WORK_CTXSW \ - (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_RDS) + (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD)
#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY) #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 14e9849..bd0edb2 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -532,12 +532,12 @@ static void bsp_init_amd(struct cpuinfo_x86 *c) } /* * Try to cache the base value so further operations can - * avoid RMW. If that faults, do not enable RDS. + * avoid RMW. If that faults, do not enable SSBD. */ if (!rdmsrl_safe(MSR_AMD64_LS_CFG, &x86_amd_ls_cfg_base)) { - setup_force_cpu_cap(X86_FEATURE_RDS); - setup_force_cpu_cap(X86_FEATURE_AMD_RDS); - x86_amd_ls_cfg_rds_mask = 1ULL << bit; + setup_force_cpu_cap(X86_FEATURE_SSBD); + setup_force_cpu_cap(X86_FEATURE_AMD_SSBD); + x86_amd_ls_cfg_ssbd_mask = 1ULL << bit; } } } @@ -816,9 +816,9 @@ static void init_amd(struct cpuinfo_x86 *c) if (!cpu_has(c, X86_FEATURE_XENPV)) set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
- if (boot_cpu_has(X86_FEATURE_AMD_RDS)) { - set_cpu_cap(c, X86_FEATURE_RDS); - set_cpu_cap(c, X86_FEATURE_AMD_RDS); + if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) { + set_cpu_cap(c, X86_FEATURE_SSBD); + set_cpu_cap(c, X86_FEATURE_AMD_SSBD); } }
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 6fd3fcf..812e92a 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -44,10 +44,10 @@ static u64 x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS;
/* * AMD specific MSR info for Speculative Store Bypass control. - * x86_amd_ls_cfg_rds_mask is initialized in identify_boot_cpu(). + * x86_amd_ls_cfg_ssbd_mask is initialized in identify_boot_cpu(). */ u64 x86_amd_ls_cfg_base; -u64 x86_amd_ls_cfg_rds_mask; +u64 x86_amd_ls_cfg_ssbd_mask;
void __init check_bugs(void) { @@ -144,7 +144,7 @@ u64 x86_spec_ctrl_get_default(void) u64 msrval = x86_spec_ctrl_base;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) - msrval |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + msrval |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags); return msrval; } EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default); @@ -157,7 +157,7 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) return;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) - host |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl); @@ -172,18 +172,18 @@ void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) return;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) - host |= rds_tif_to_spec_ctrl(current_thread_info()->flags); + host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) wrmsrl(MSR_IA32_SPEC_CTRL, host); } EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host);
-static void x86_amd_rds_enable(void) +static void x86_amd_ssb_disable(void) { - u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask; + u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_ssbd_mask;
- if (boot_cpu_has(X86_FEATURE_AMD_RDS)) + if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) wrmsrl(MSR_AMD64_LS_CFG, msrval); }
@@ -471,7 +471,7 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) enum ssb_mitigation mode = SPEC_STORE_BYPASS_NONE; enum ssb_mitigation_cmd cmd;
- if (!boot_cpu_has(X86_FEATURE_RDS)) + if (!boot_cpu_has(X86_FEATURE_SSBD)) return mode;
cmd = ssb_parse_cmdline(); @@ -505,7 +505,7 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) /* * We have three CPU feature flags that are in play here: * - X86_BUG_SPEC_STORE_BYPASS - CPU is susceptible. - * - X86_FEATURE_RDS - CPU is able to turn off speculative store bypass + * - X86_FEATURE_SSBD - CPU is able to turn off speculative store bypass * - X86_FEATURE_SPEC_STORE_BYPASS_DISABLE - engage the mitigation */ if (mode == SPEC_STORE_BYPASS_DISABLE) { @@ -516,12 +516,12 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) */ switch (boot_cpu_data.x86_vendor) { case X86_VENDOR_INTEL: - x86_spec_ctrl_base |= SPEC_CTRL_RDS; - x86_spec_ctrl_mask &= ~SPEC_CTRL_RDS; - x86_spec_ctrl_set(SPEC_CTRL_RDS); + x86_spec_ctrl_base |= SPEC_CTRL_SSBD; + x86_spec_ctrl_mask &= ~SPEC_CTRL_SSBD; + x86_spec_ctrl_set(SPEC_CTRL_SSBD); break; case X86_VENDOR_AMD: - x86_amd_rds_enable(); + x86_amd_ssb_disable(); break; } } @@ -554,16 +554,16 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) if (task_spec_ssb_force_disable(task)) return -EPERM; task_clear_spec_ssb_disable(task); - update = test_and_clear_tsk_thread_flag(task, TIF_RDS); + update = test_and_clear_tsk_thread_flag(task, TIF_SSBD); break; case PR_SPEC_DISABLE: task_set_spec_ssb_disable(task); - update = !test_and_set_tsk_thread_flag(task, TIF_RDS); + update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); break; case PR_SPEC_FORCE_DISABLE: task_set_spec_ssb_disable(task); task_set_spec_ssb_force_disable(task); - update = !test_and_set_tsk_thread_flag(task, TIF_RDS); + update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); break; default: return -ERANGE; @@ -633,7 +633,7 @@ void x86_spec_ctrl_setup_ap(void) x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
if (ssb_mode == SPEC_STORE_BYPASS_DISABLE) - x86_amd_rds_enable(); + x86_amd_ssb_disable(); }
#ifdef CONFIG_SYSFS diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 7405c86..6f3a5d7 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -867,7 +867,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
if (!x86_match_cpu(cpu_no_spec_store_bypass) && - !(ia32_cap & ARCH_CAP_RDS_NO)) + !(ia32_cap & ARCH_CAP_SSBD_NO)) setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (x86_match_cpu(cpu_no_speculation)) diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index ac25d1e5..a34e357 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -119,7 +119,7 @@ static void early_init_intel(struct cpuinfo_x86 *c) setup_clear_cpu_cap(X86_FEATURE_STIBP); setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL); setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP); - setup_clear_cpu_cap(X86_FEATURE_RDS); + setup_clear_cpu_cap(X86_FEATURE_SSBD); }
/* diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 9689e92..57d4ba2 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -203,11 +203,11 @@ static __always_inline void __speculative_store_bypass_update(unsigned long tifn { u64 msr;
- if (static_cpu_has(X86_FEATURE_AMD_RDS)) { - msr = x86_amd_ls_cfg_base | rds_tif_to_amd_ls_cfg(tifn); + if (static_cpu_has(X86_FEATURE_AMD_SSBD)) { + msr = x86_amd_ls_cfg_base | ssbd_tif_to_amd_ls_cfg(tifn); wrmsrl(MSR_AMD64_LS_CFG, msr); } else { - msr = x86_spec_ctrl_base | rds_tif_to_spec_ctrl(tifn); + msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); wrmsrl(MSR_IA32_SPEC_CTRL, msr); } } @@ -246,7 +246,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, if ((tifp ^ tifn) & _TIF_NOTSC) cr4_toggle_bits(X86_CR4_TSD);
- if ((tifp ^ tifn) & _TIF_RDS) + if ((tifp ^ tifn) & _TIF_SSBD) __speculative_store_bypass_update(tifn); }
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit e96f46ee8587607a828f783daa6eb5b44d25004d upstream
The style for the 'status' file is CamelCase or words_with_underscores, so rename the field from 'Speculation Store Bypass' to 'Speculation_Store_Bypass'.
Fixes: fae1fa0fc ("proc: Provide details on speculation flaw mitigations") Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
fs/proc/array.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c index 3141478..cb71cba 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -333,7 +333,7 @@ static inline void task_seccomp(struct seq_file *m, struct task_struct *p) #ifdef CONFIG_SECCOMP seq_printf(m, "Seccomp:\t%d\n", p->seccomp.mode); #endif - seq_printf(m, "\nSpeculation Store Bypass:\t"); + seq_printf(m, "\nSpeculation_Store_Bypass:\t"); switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) { case -EINVAL: seq_printf(m, "unknown");
From: Borislav Petkov bp@suse.de
commit dd0792699c4058e63c0715d9a7c2d40226fcdddc upstream
Fix some typos, improve formulations, and end sentences with a full stop.
Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
Documentation/spec_ctrl.txt | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/Documentation/spec_ctrl.txt b/Documentation/spec_ctrl.txt index 1b3690d..32f3d55 100644 --- a/Documentation/spec_ctrl.txt +++ b/Documentation/spec_ctrl.txt @@ -2,13 +2,13 @@ Speculation Control ===================
-Quite some CPUs have speculation related misfeatures which are in fact -vulnerabilites causing data leaks in various forms even accross privilege -domains. +Quite some CPUs have speculation-related misfeatures which are in +fact vulnerabilities causing data leaks in various forms even across +privilege domains.
The kernel provides mitigation for such vulnerabilities in various -forms. Some of these mitigations are compile time configurable and some on -the kernel command line. +forms. Some of these mitigations are compile-time configurable and some +can be supplied on the kernel command line.
There is also a class of mitigations which are very expensive, but they can be restricted to a certain set of processes or tasks in controlled @@ -32,18 +32,18 @@ the following meaning: Bit Define Description ==== ===================== =================================================== 0 PR_SPEC_PRCTL Mitigation can be controlled per task by - PR_SET_SPECULATION_CTRL + PR_SET_SPECULATION_CTRL. 1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is - disabled + disabled. 2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is - enabled + enabled. 3 PR_SPEC_FORCE_DISABLE Same as PR_SPEC_DISABLE, but cannot be undone. A subsequent prctl(..., PR_SPEC_ENABLE) will fail. ==== ===================== ===================================================
If all bits are 0 the CPU is not affected by the speculation misfeature.
-If PR_SPEC_PRCTL is set, then the per task control of the mitigation is +If PR_SPEC_PRCTL is set, then the per-task control of the mitigation is available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will fail.
@@ -61,9 +61,9 @@ Common error codes Value Meaning ======= ================================================================= EINVAL The prctl is not implemented by the architecture or unused - prctl(2) arguments are not 0 + prctl(2) arguments are not 0.
-ENODEV arg2 is selecting a not supported speculation misfeature +ENODEV arg2 is selecting a not supported speculation misfeature. ======= =================================================================
PR_SET_SPECULATION_CTRL error codes @@ -74,7 +74,7 @@ Value Meaning 0 Success
ERANGE arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor - PR_SPEC_DISABLE nor PR_SPEC_FORCE_DISABLE + PR_SPEC_DISABLE nor PR_SPEC_FORCE_DISABLE.
ENXIO Control of the selected speculation misfeature is not possible. See PR_GET_SPECULATION_CTRL.
From: Jiri Kosina jkosina@suse.cz
commit d66d8ff3d21667b41eddbe86b35ab411e40d8c5f upstream
__ssb_select_mitigation() returns one of the members of enum ssb_mitigation, not ssb_mitigation_cmd; fix the prototype to reflect that.
Fixes: 24f7fc83b9204 ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation") Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 812e92a..5b58b76 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -466,7 +466,7 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void) return cmd; }
-static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void) +static enum ssb_mitigation __init __ssb_select_mitigation(void) { enum ssb_mitigation mode = SPEC_STORE_BYPASS_NONE; enum ssb_mitigation_cmd cmd;
From: Jiri Kosina jkosina@suse.cz
commit 7bb4d366cba992904bffa4820d24e70a3de93e76 upstream
cpu_show_common() is not used outside of arch/x86/kernel/cpu/bugs.c, so make it static.
Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 5b58b76..512be68 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -638,7 +638,7 @@ void x86_spec_ctrl_setup_ap(void)
#ifdef CONFIG_SYSFS
-ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, +static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, char *buf, unsigned int bug) { if (!boot_cpu_has_bug(bug))
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit ffed645e3be0e32f8e9ab068d257aee8d0fe8eec upstream
Fix the misaligned parameters of cpu_show_common() and add the missing 'void' to the argument list of ssb_select_mitigation().

Fixes: 7bb4d366c ("x86/bugs: Make cpu_show_common() static") Fixes: 24f7fc83b ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation") Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 512be68..84de0fc 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -529,7 +529,7 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void) return mode; }
-static void ssb_select_mitigation() +static void ssb_select_mitigation(void) { ssb_mode = __ssb_select_mitigation();
@@ -639,7 +639,7 @@ void x86_spec_ctrl_setup_ap(void) #ifdef CONFIG_SYSFS
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, - char *buf, unsigned int bug) + char *buf, unsigned int bug) { if (!boot_cpu_has_bug(bug)) return sprintf(buf, "Not affected\n");
From: Jim Mattson jmattson@google.com
commit 5f2b745f5e1304f438f9b2cd03ebc8120b6e0d3b upstream
Cast val and (val >> 32) to (u32), so that they fit in a general-purpose register in both 32-bit and 64-bit code.
[ tglx: Made it u32 instead of uintptr_t ]
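For context, a compile-only sketch of the operand split the casts make explicit (an assumption-laden illustration, not the kernel code: it drops the ALTERNATIVE patching, and wrmsr is a privileged instruction, so the helper must never actually run in user space):

#include <stdint.h>

static inline void wrmsr_sketch(uint32_t msr, uint64_t val)
{
	/* "a"/"d" each want a 32-bit GPR, so split the 64-bit value. */
	asm volatile("wrmsr"
		     :
		     : "c" (msr), "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32))
		     : "memory");
}

int main(void)
{
	(void)wrmsr_sketch;	/* referenced only so the sketch compiles cleanly */
	return 0;
}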
Fixes: c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload") Signed-off-by: Jim Mattson jmattson@google.com Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 930c1594..640c11b 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -219,8 +219,8 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature) { asm volatile(ALTERNATIVE("", "wrmsr", %c[feature]) : : "c" (msr), - "a" (val), - "d" (val >> 32), + "a" ((u32)val), + "d" ((u32)(val >> 32)), [feature] "i" (feature) : "memory"); }
From: Borislav Petkov bp@suse.de
commit e7c587da125291db39ddf1f49b18e5970adbac17 upstream
Intel and AMD have different CPUID bits for these features, hence use synthetic bits which get set for the respective vendor in init_speculation_control(), so that debacles like what the commit message of
c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload")
talks about don't happen anymore.
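For reference, the vendor-specific bits this patch renames can be inspected from user space with CPUID. A small sketch using GCC's <cpuid.h> (bit positions taken from the word-13 definitions below):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID leaf 0x80000008: EBX carries the AMD IBPB/IBRS/STIBP bits. */
	if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
		fprintf(stderr, "leaf 0x80000008 not supported\n");
		return 1;
	}
	printf("AMD_IBPB=%u AMD_IBRS=%u AMD_STIBP=%u\n",
	       (ebx >> 12) & 1, (ebx >> 14) & 1, (ebx >> 15) & 1);
	return 0;
}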
Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Tested-by: Jörg Otte jrg.otte@gmail.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Link: https://lkml.kernel.org/r/20180504161815.GG9257@pd.tnic Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Srivatsa: Backported to 4.4.y, skipping the KVM changes in this patch. ] Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 12 ++++++++---- arch/x86/kernel/cpu/common.c | 14 ++++++++++---- 2 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 97926be..9f64d10 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -204,7 +204,10 @@ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ -#define X86_FEATURE_AMD_SSBD (7*32+24) /* "" AMD SSBD implementation */ +#define X86_FEATURE_AMD_SSBD ( 7*32+24) /* "" AMD SSBD implementation */ +#define X86_FEATURE_IBRS ( 7*32+25) /* Indirect Branch Restricted Speculation */ +#define X86_FEATURE_IBPB ( 7*32+26) /* Indirect Branch Prediction Barrier */ +#define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ @@ -256,9 +259,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ -#define X86_FEATURE_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ -#define X86_FEATURE_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */ -#define X86_FEATURE_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */ +#define X86_FEATURE_AMD_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ +#define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */ +#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ @@ -293,6 +296,7 @@ #define X86_FEATURE_SUCCOR (17*32+1) /* Uncorrectable error containment and recovery */ #define X86_FEATURE_SMCA (17*32+3) /* Scalable MCA */
+ /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 6f3a5d7..f2b579f 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -683,17 +683,23 @@ static void init_speculation_control(struct cpuinfo_x86 *c) * and they also have a different bit for STIBP support. Also, * a hypervisor might have set the individual AMD bits even on * Intel CPUs, for finer-grained selection of what's available. - * - * We use the AMD bits in 0x8000_0008 EBX as the generic hardware - * features, which are visible in /proc/cpuinfo and used by the - * kernel. So set those accordingly from the Intel bits. */ if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) { set_cpu_cap(c, X86_FEATURE_IBRS); set_cpu_cap(c, X86_FEATURE_IBPB); } + if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) set_cpu_cap(c, X86_FEATURE_STIBP); + + if (cpu_has(c, X86_FEATURE_AMD_IBRS)) + set_cpu_cap(c, X86_FEATURE_IBRS); + + if (cpu_has(c, X86_FEATURE_AMD_IBPB)) + set_cpu_cap(c, X86_FEATURE_IBPB); + + if (cpu_has(c, X86_FEATURE_AMD_STIBP)) + set_cpu_cap(c, X86_FEATURE_STIBP); }
void get_cpu_cap(struct cpuinfo_x86 *c)
From: Thomas Gleixner tglx@linutronix.de
commit 7eb8956a7fec3c1f0abc2a5517dada99ccc8a961 upstream
The availability of the SPEC_CTRL MSR is enumerated by a CPUID bit on Intel and implied by IBRS or STIBP support on AMD. That's confusing, and if an AMD CPU lacks IBRS support because the underlying problem has been fixed, but has another valid bit in the SPEC_CTRL MSR, the whole scheme falls apart.
Add a synthetic feature bit X86_FEATURE_MSR_SPEC_CTRL to denote the availability on both Intel and AMD.
While at it, replace the boot_cpu_has() checks with static_cpu_has() where possible. This prevents late microcode loading from exposing SPEC_CTRL, but late loading is already very limited as it does not reevaluate the mitigation options and other bits and pieces. Having static_cpu_has() is the simplest and least fragile solution.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 3 +++ arch/x86/kernel/cpu/bugs.c | 18 +++++++++++------- arch/x86/kernel/cpu/common.c | 9 +++++++-- arch/x86/kernel/cpu/intel.c | 1 + 4 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 9f64d10..dd04cd7 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -198,6 +198,9 @@
#define X86_FEATURE_RETPOLINE ( 7*32+29) /* "" Generic Retpoline mitigation for Spectre variant 2 */ #define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* "" AMD Retpoline mitigation for Spectre variant 2 */ + +#define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */ + /* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 84de0fc..e23e289 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -63,7 +63,7 @@ void __init check_bugs(void) * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD * init code as it is not enumerated and depends on the family. */ - if (boot_cpu_has(X86_FEATURE_IBRS)) + if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
/* Select the proper spectre mitigation before patching alternatives */ @@ -143,7 +143,7 @@ u64 x86_spec_ctrl_get_default(void) { u64 msrval = x86_spec_ctrl_base;
- if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) msrval |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags); return msrval; } @@ -153,10 +153,12 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) { u64 host = x86_spec_ctrl_base;
- if (!boot_cpu_has(X86_FEATURE_IBRS)) + /* Is MSR_SPEC_CTRL implemented ? */ + if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) return;
- if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + /* Intel controls SSB in MSR_SPEC_CTRL */ + if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) @@ -168,10 +170,12 @@ void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) { u64 host = x86_spec_ctrl_base;
- if (!boot_cpu_has(X86_FEATURE_IBRS)) + /* Is MSR_SPEC_CTRL implemented ? */ + if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) return;
- if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) + /* Intel controls SSB in MSR_SPEC_CTRL */ + if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) @@ -629,7 +633,7 @@ int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
void x86_spec_ctrl_setup_ap(void) { - if (boot_cpu_has(X86_FEATURE_IBRS)) + if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
if (ssb_mode == SPEC_STORE_BYPASS_DISABLE) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index f2b579f..1f70ff1 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -687,19 +687,24 @@ static void init_speculation_control(struct cpuinfo_x86 *c) if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) { set_cpu_cap(c, X86_FEATURE_IBRS); set_cpu_cap(c, X86_FEATURE_IBPB); + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); }
if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) set_cpu_cap(c, X86_FEATURE_STIBP);
- if (cpu_has(c, X86_FEATURE_AMD_IBRS)) + if (cpu_has(c, X86_FEATURE_AMD_IBRS)) { set_cpu_cap(c, X86_FEATURE_IBRS); + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); + }
if (cpu_has(c, X86_FEATURE_AMD_IBPB)) set_cpu_cap(c, X86_FEATURE_IBPB);
- if (cpu_has(c, X86_FEATURE_AMD_STIBP)) + if (cpu_has(c, X86_FEATURE_AMD_STIBP)) { set_cpu_cap(c, X86_FEATURE_STIBP); + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); + } }
void get_cpu_cap(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index a34e357..9a84e75 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -118,6 +118,7 @@ static void early_init_intel(struct cpuinfo_x86 *c) setup_clear_cpu_cap(X86_FEATURE_IBPB); setup_clear_cpu_cap(X86_FEATURE_STIBP); setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL); + setup_clear_cpu_cap(X86_FEATURE_MSR_SPEC_CTRL); setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP); setup_clear_cpu_cap(X86_FEATURE_SSBD); }
From: Thomas Gleixner tglx@linutronix.de
commit 52817587e706686fcdb27f14c1b000c92f266c96 upstream
Like the other bits, the SSBD enumeration is magically shared between Intel and AMD, though the mechanisms are different.
Make X86_FEATURE_SSBD synthetic and set it depending on the vendor-specific features or the family-dependent setup.
Change the Intel bit to X86_FEATURE_SPEC_CTRL_SSBD to denote that SSBD is controlled via MSR_SPEC_CTRL and fix up the usage sites.
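The net effect can be sketched as follows (illustrative only, not code from the patch; ssb_mitigation_possible and the toggle_via_*() helpers are hypothetical placeholders, while the feature names are the ones defined above):

  /* Vendor-neutral: is SSBD available at all? */
  if (boot_cpu_has(X86_FEATURE_SSBD))
          ssb_mitigation_possible = true;

  /* Vendor-specific: how is SSBD actually toggled? */
  if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
          toggle_via_spec_ctrl();   /* MSR_IA32_SPEC_CTRL (Intel enumeration) */
  else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
          toggle_via_ls_cfg();      /* MSR_AMD64_LS_CFG (AMD enumeration) */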
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 6 ++++-- arch/x86/kernel/cpu/amd.c | 7 +------ arch/x86/kernel/cpu/bugs.c | 10 +++++----- arch/x86/kernel/cpu/common.c | 3 +++ arch/x86/kernel/cpu/intel.c | 1 + arch/x86/kernel/process.c | 2 +- 6 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index dd04cd7..7dd1bba 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -200,6 +200,7 @@ #define X86_FEATURE_RETPOLINE_AMD ( 7*32+30) /* "" AMD Retpoline mitigation for Spectre variant 2 */
#define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */ +#define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */
/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */ #define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */ @@ -207,7 +208,8 @@ #define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled*/ #define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */ #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */ -#define X86_FEATURE_AMD_SSBD ( 7*32+24) /* "" AMD SSBD implementation */ +#define X86_FEATURE_LS_CFG_SSBD ( 7*32+24) /* "" AMD SSBD implementation */ + #define X86_FEATURE_IBRS ( 7*32+25) /* Indirect Branch Restricted Speculation */ #define X86_FEATURE_IBPB ( 7*32+26) /* Indirect Branch Prediction Barrier */ #define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */ @@ -306,7 +308,7 @@ #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */ -#define X86_FEATURE_SSBD (18*32+31) /* Speculative Store Bypass Disable */ +#define X86_FEATURE_SPEC_CTRL_SSBD (18*32+31) /* "" Speculative Store Bypass Disable */
/* * BUG word(s) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index bd0edb2..a97fd67 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -535,8 +535,8 @@ static void bsp_init_amd(struct cpuinfo_x86 *c) * avoid RMW. If that faults, do not enable SSBD. */ if (!rdmsrl_safe(MSR_AMD64_LS_CFG, &x86_amd_ls_cfg_base)) { + setup_force_cpu_cap(X86_FEATURE_LS_CFG_SSBD); setup_force_cpu_cap(X86_FEATURE_SSBD); - setup_force_cpu_cap(X86_FEATURE_AMD_SSBD); x86_amd_ls_cfg_ssbd_mask = 1ULL << bit; } } @@ -815,11 +815,6 @@ static void init_amd(struct cpuinfo_x86 *c) /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */ if (!cpu_has(c, X86_FEATURE_XENPV)) set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS); - - if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) { - set_cpu_cap(c, X86_FEATURE_SSBD); - set_cpu_cap(c, X86_FEATURE_AMD_SSBD); - } }
#ifdef CONFIG_X86_32 diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index e23e289..9be7292 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -157,8 +157,8 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) return;
- /* Intel controls SSB in MSR_SPEC_CTRL */ - if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) + /* SSBD controlled in MSR_SPEC_CTRL */ + if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) @@ -174,8 +174,8 @@ void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) return;
- /* Intel controls SSB in MSR_SPEC_CTRL */ - if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) + /* SSBD controlled in MSR_SPEC_CTRL */ + if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags);
if (host != guest_spec_ctrl) @@ -187,7 +187,7 @@ static void x86_amd_ssb_disable(void) { u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_ssbd_mask;
- if (boot_cpu_has(X86_FEATURE_AMD_SSBD)) + if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD)) wrmsrl(MSR_AMD64_LS_CFG, msrval); }
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 1f70ff1..1097723 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -693,6 +693,9 @@ static void init_speculation_control(struct cpuinfo_x86 *c) if (cpu_has(c, X86_FEATURE_INTEL_STIBP)) set_cpu_cap(c, X86_FEATURE_STIBP);
+ if (cpu_has(c, X86_FEATURE_SPEC_CTRL_SSBD)) + set_cpu_cap(c, X86_FEATURE_SSBD); + if (cpu_has(c, X86_FEATURE_AMD_IBRS)) { set_cpu_cap(c, X86_FEATURE_IBRS); set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 9a84e75..4dce22d 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -121,6 +121,7 @@ static void early_init_intel(struct cpuinfo_x86 *c) setup_clear_cpu_cap(X86_FEATURE_MSR_SPEC_CTRL); setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP); setup_clear_cpu_cap(X86_FEATURE_SSBD); + setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL_SSBD); }
/* diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 57d4ba2..8cefbd2 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -203,7 +203,7 @@ static __always_inline void __speculative_store_bypass_update(unsigned long tifn { u64 msr;
- if (static_cpu_has(X86_FEATURE_AMD_SSBD)) { + if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) { msr = x86_amd_ls_cfg_base | ssbd_tif_to_amd_ls_cfg(tifn); wrmsrl(MSR_AMD64_LS_CFG, msr); } else {
From: Borislav Petkov bp@suse.de
commit f7f3dc00f61261cdc9ccd8b886f21bc4dffd6fd9 upstream
CPUID Fn8000_0007_EDX[CPB] is wrongly 0 on AMD Family 17h models up to and including revision B1. But those parts do support CPB (AMD's Core Performance Boost cpufreq CPU feature), so fix that.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Sherry Hurwitz sherry.hurwitz@amd.com Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20170907170821.16021-1-bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/amd.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index a97fd67..87f4a0d 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -713,6 +713,16 @@ static void init_amd_bd(struct cpuinfo_x86 *c) } }
+static void init_amd_zn(struct cpuinfo_x86 *c) +{ + /* + * Fix erratum 1076: CPB feature bit not being set in CPUID. It affects + * all up to and including B1. + */ + if (c->x86_model <= 1 && c->x86_mask <= 1) + set_cpu_cap(c, X86_FEATURE_CPB); +} + static void init_amd(struct cpuinfo_x86 *c) { u32 dummy; @@ -743,6 +753,7 @@ static void init_amd(struct cpuinfo_x86 *c) case 0x10: init_amd_gh(c); break; case 0x12: init_amd_ln(c); break; case 0x15: init_amd_bd(c); break; + case 0x17: init_amd_zn(c); break; }
/* Enable workaround for FXSAVE leak */
From: Thomas Gleixner tglx@linutronix.de
commit d1035d971829dcf80e8686ccde26f94b0a069472 upstream
Add a ZEN feature bit so family-dependent static_cpu_has() optimizations can be built for ZEN.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 2 ++ arch/x86/kernel/cpu/amd.c | 1 + 2 files changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 7dd1bba..8ae9132 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -213,6 +213,8 @@ #define X86_FEATURE_IBRS ( 7*32+25) /* Indirect Branch Restricted Speculation */ #define X86_FEATURE_IBPB ( 7*32+26) /* Indirect Branch Prediction Barrier */ #define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */ +#define X86_FEATURE_ZEN ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */ +
/* Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 87f4a0d..9f61518 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -715,6 +715,7 @@ static void init_amd_bd(struct cpuinfo_x86 *c)
static void init_amd_zn(struct cpuinfo_x86 *c) { + set_cpu_cap(c, X86_FEATURE_ZEN); /* * Fix erratum 1076: CPB feature bit not being set in CPUID. It affects * all up to and including B1.
From: Thomas Gleixner tglx@linutronix.de
commit 1f50ddb4f4189243c05926b842dc1a0332195f31 upstream
The AMD64_LS_CFG MSR is a per-core MSR on Family 17H CPUs. That means when hyperthreading is enabled, the SSBD bit toggle needs to take both SMT siblings into account. Otherwise the following situation can happen:
    CPU0                  CPU1

    disable SSB
                          disable SSB
                          enable SSB   <- Enables it for the Core,
                                          i.e. for CPU0 as well
So after the SSB enable on CPU1 the task on CPU0 runs with SSB enabled again.
On Intel the SSBD control is per core as well, but the synchronization logic is implemented behind the per thread SPEC_CTRL MSR. It works like this:
CORE_SPEC_CTRL = THREAD0_SPEC_CTRL | THREAD1_SPEC_CTRL
i.e. if one of the threads enables a mitigation then this affects both and the mitigation is only disabled in the core when both threads disabled it.
Add the necessary synchronization logic for AMD family 17H. Unfortunately that requires a spinlock to serialize the access to the MSR, but the locks are only shared between siblings.
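Stripped of the hotplug and prctl-reentry details, the coordination added below boils down to a per-core reference count. A simplified model (illustrative only; struct ssb_core, disable_count and the ls_cfg_* variables are stand-ins for the real ssb_state machinery in the patch):

  static void ssbd_core_request(struct ssb_core *core, bool enable)
  {
          raw_spin_lock(&core->lock);
          if (enable) {
                  /* First sibling wanting SSBD writes the MSR for the core */
                  if (core->disable_count++ == 0)
                          wrmsrl(MSR_AMD64_LS_CFG,
                                 ls_cfg_base | ls_cfg_ssbd_mask);
          } else {
                  /*
                   * Last sibling dropping the request clears it again,
                   * mirroring CORE_SPEC_CTRL = THREAD0 | THREAD1 on Intel.
                   */
                  if (--core->disable_count == 0)
                          wrmsrl(MSR_AMD64_LS_CFG, ls_cfg_base);
          }
          raw_spin_unlock(&core->lock);
  }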
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/spec-ctrl.h | 6 ++ arch/x86/kernel/process.c | 125 ++++++++++++++++++++++++++++++++++++-- arch/x86/kernel/smpboot.c | 5 ++ 3 files changed, 130 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index dc21209..0cb49c4 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -33,6 +33,12 @@ static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn) return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL; }
+#ifdef CONFIG_SMP +extern void speculative_store_bypass_ht_init(void); +#else +static inline void speculative_store_bypass_ht_init(void) { } +#endif + extern void speculative_store_bypass_update(void);
#endif diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 8cefbd2..0842869 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -199,22 +199,135 @@ static inline void switch_to_bitmap(struct tss_struct *tss, } }
-static __always_inline void __speculative_store_bypass_update(unsigned long tifn) +#ifdef CONFIG_SMP + +struct ssb_state { + struct ssb_state *shared_state; + raw_spinlock_t lock; + unsigned int disable_state; + unsigned long local_state; +}; + +#define LSTATE_SSB 0 + +static DEFINE_PER_CPU(struct ssb_state, ssb_state); + +void speculative_store_bypass_ht_init(void) { - u64 msr; + struct ssb_state *st = this_cpu_ptr(&ssb_state); + unsigned int this_cpu = smp_processor_id(); + unsigned int cpu; + + st->local_state = 0; + + /* + * Shared state setup happens once on the first bringup + * of the CPU. It's not destroyed on CPU hotunplug. + */ + if (st->shared_state) + return; + + raw_spin_lock_init(&st->lock);
- if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) { - msr = x86_amd_ls_cfg_base | ssbd_tif_to_amd_ls_cfg(tifn); + /* + * Go over HT siblings and check whether one of them has set up the + * shared state pointer already. + */ + for_each_cpu(cpu, topology_sibling_cpumask(this_cpu)) { + if (cpu == this_cpu) + continue; + + if (!per_cpu(ssb_state, cpu).shared_state) + continue; + + /* Link it to the state of the sibling: */ + st->shared_state = per_cpu(ssb_state, cpu).shared_state; + return; + } + + /* + * First HT sibling to come up on the core. Link shared state of + * the first HT sibling to itself. The siblings on the same core + * which come up later will see the shared state pointer and link + * themself to the state of this CPU. + */ + st->shared_state = st; +} + +/* + * Logic is: First HT sibling enables SSBD for both siblings in the core + * and last sibling to disable it, disables it for the whole core. This how + * MSR_SPEC_CTRL works in "hardware": + * + * CORE_SPEC_CTRL = THREAD0_SPEC_CTRL | THREAD1_SPEC_CTRL + */ +static __always_inline void amd_set_core_ssb_state(unsigned long tifn) +{ + struct ssb_state *st = this_cpu_ptr(&ssb_state); + u64 msr = x86_amd_ls_cfg_base; + + if (!static_cpu_has(X86_FEATURE_ZEN)) { + msr |= ssbd_tif_to_amd_ls_cfg(tifn); wrmsrl(MSR_AMD64_LS_CFG, msr); + return; + } + + if (tifn & _TIF_SSBD) { + /* + * Since this can race with prctl(), block reentry on the + * same CPU. + */ + if (__test_and_set_bit(LSTATE_SSB, &st->local_state)) + return; + + msr |= x86_amd_ls_cfg_ssbd_mask; + + raw_spin_lock(&st->shared_state->lock); + /* First sibling enables SSBD: */ + if (!st->shared_state->disable_state) + wrmsrl(MSR_AMD64_LS_CFG, msr); + st->shared_state->disable_state++; + raw_spin_unlock(&st->shared_state->lock); } else { - msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); - wrmsrl(MSR_IA32_SPEC_CTRL, msr); + if (!__test_and_clear_bit(LSTATE_SSB, &st->local_state)) + return; + + raw_spin_lock(&st->shared_state->lock); + st->shared_state->disable_state--; + if (!st->shared_state->disable_state) + wrmsrl(MSR_AMD64_LS_CFG, msr); + raw_spin_unlock(&st->shared_state->lock); } } +#else +static __always_inline void amd_set_core_ssb_state(unsigned long tifn) +{ + u64 msr = x86_amd_ls_cfg_base | ssbd_tif_to_amd_ls_cfg(tifn); + + wrmsrl(MSR_AMD64_LS_CFG, msr); +} +#endif + +static __always_inline void intel_set_ssb_state(unsigned long tifn) +{ + u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); + + wrmsrl(MSR_IA32_SPEC_CTRL, msr); +} + +static __always_inline void __speculative_store_bypass_update(unsigned long tifn) +{ + if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) + amd_set_core_ssb_state(tifn); + else + intel_set_ssb_state(tifn); +}
void speculative_store_bypass_update(void) { + preempt_disable(); __speculative_store_bypass_update(current_thread_info()->flags); + preempt_enable(); }
void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 1f7aefc..c017f1c 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -75,6 +75,7 @@ #include <asm/i8259.h> #include <asm/realmode.h> #include <asm/misc.h> +#include <asm/spec-ctrl.h>
/* Number of siblings per CPU package */ int smp_num_siblings = 1; @@ -217,6 +218,8 @@ static void notrace start_secondary(void *unused) */ check_tsc_sync_target();
+ speculative_store_bypass_ht_init(); + /* * Lock vector_lock and initialize the vectors on this cpu * before setting the cpu online. We must set it online with @@ -1209,6 +1212,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) set_mtrr_aps_delayed_init();
smp_quirk_init_udelay(); + + speculative_store_bypass_ht_init(); }
void arch_enable_nonboot_cpus_begin(void)
From: Thomas Gleixner tglx@linutronix.de
commit ccbcd2674472a978b48c91c1fbfb66c0ff959f24 upstream
AMD is proposing a VIRT_SPEC_CTRL MSR to handle the Speculative Store Bypass Disable via MSR_AMD64_LS_CFG, so that guests do not have to care about the bit position of the SSBD bit, which facilitates migration. Also, the sibling coordination on Family 17H CPUs can only be done on the host.
Extend x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() with an extra argument for the VIRT_SPEC_CTRL MSR.
Hand in 0 from VMX and in SVM add a new virt_spec_ctrl member to the CPU data structure which is going to be used in later patches for the actual implementation.
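Since the KVM pieces are skipped in this backport, it is worth keeping the intended upstream call pattern in mind. Roughly (a sketch of the upstream callers, not code in this series; svm->virt_spec_ctrl is the member the upstream patch adds):

  /* SVM vcpu_run path: */
  x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
  /* ... enter the guest ... */
  x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);

  /* VMX does not emulate VIRT_SPEC_CTRL, so it hands in 0: */
  x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
  /* ... enter the guest ... */
  x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);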
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Srivatsa: Backported to 4.4.y, skipping the KVM changes in this patch. ] Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/spec-ctrl.h | 9 ++++++--- arch/x86/kernel/cpu/bugs.c | 20 ++++++++++++++++++-- 2 files changed, 24 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 0cb49c4..6e28740 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -10,10 +10,13 @@ * the guest has, while on VMEXIT we restore the host view. This * would be easier if SPEC_CTRL were architecturally maskable or * shadowable for guests but this is not (currently) the case. - * Takes the guest view of SPEC_CTRL MSR as a parameter. + * Takes the guest view of SPEC_CTRL MSR as a parameter and also + * the guest's version of VIRT_SPEC_CTRL, if emulated. */ -extern void x86_spec_ctrl_set_guest(u64); -extern void x86_spec_ctrl_restore_host(u64); +extern void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, + u64 guest_virt_spec_ctrl); +extern void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, + u64 guest_virt_spec_ctrl);
/* AMD specific Speculative Store Bypass MSR data */ extern u64 x86_amd_ls_cfg_base; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 9be7292..a1c98fd 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -149,7 +149,15 @@ u64 x86_spec_ctrl_get_default(void) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) +/** + * x86_spec_ctrl_set_guest - Set speculation control registers for the guest + * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL + * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL + * (may get translated to MSR_AMD64_LS_CFG bits) + * + * Avoids writing to the MSR if the content/bits are the same + */ +void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) { u64 host = x86_spec_ctrl_base;
@@ -166,7 +174,15 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_set_guest);
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl) +/** + * x86_spec_ctrl_restore_host - Restore host speculation control registers + * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL + * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL + * (may get translated to MSR_AMD64_LS_CFG bits) + * + * Avoids writing to the MSR if the content/bits are the same + */ +void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) { u64 host = x86_spec_ctrl_base;
From: Tom Lendacky thomas.lendacky@amd.com
commit 11fb0683493b2da112cd64c9dada221b52463bf7 upstream
Some AMD processors only support a non-architectural means of enabling speculative store bypass disable (SSBD). To allow a simplified view of this to a guest, an architectural definition has been created through a new CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a hypervisor can virtualize the existence of this definition and provide an architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD support to use this MSR when present.
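For reference, the new CPUID bit is easy to inspect from user space; a small self-contained example (illustrative, not part of the patch), useful e.g. inside a guest to see whether the hypervisor advertises the architectural SSBD interface:

  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int eax, ebx, ecx, edx;

          /* CPUID leaf 0x80000008, VIRT_SSBD is EBX bit 25 */
          if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
                  return 1;       /* extended leaf not supported */

          printf("VIRT_SSBD: %s\n", (ebx & (1u << 25)) ? "yes" : "no");
          return 0;
  }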
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 2 ++ arch/x86/kernel/cpu/bugs.c | 4 +++- arch/x86/kernel/process.c | 13 ++++++++++++- 4 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 8ae9132..f4b175d 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -269,6 +269,7 @@ #define X86_FEATURE_AMD_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ #define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */ #define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */ +#define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 2ea2ff1..22f2dd5 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -328,6 +328,8 @@ #define MSR_AMD64_IBSOPDATA4 0xc001103d #define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */
+#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f + /* Fam 16h MSRs */ #define MSR_F16H_L2I_PERF_CTL 0xc0010230 #define MSR_F16H_L2I_PERF_CTR 0xc0010231 diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index a1c98fd..50ab206a 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -203,7 +203,9 @@ static void x86_amd_ssb_disable(void) { u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_ssbd_mask;
- if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD)) + if (boot_cpu_has(X86_FEATURE_VIRT_SSBD)) + wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, SPEC_CTRL_SSBD); + else if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD)) wrmsrl(MSR_AMD64_LS_CFG, msrval); }
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 0842869..eab9d0c 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -308,6 +308,15 @@ static __always_inline void amd_set_core_ssb_state(unsigned long tifn) } #endif
+static __always_inline void amd_set_ssb_virt_state(unsigned long tifn) +{ + /* + * SSBD has the same definition in SPEC_CTRL and VIRT_SPEC_CTRL, + * so ssbd_tif_to_spec_ctrl() just works. + */ + wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn)); +} + static __always_inline void intel_set_ssb_state(unsigned long tifn) { u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); @@ -317,7 +326,9 @@ static __always_inline void intel_set_ssb_state(unsigned long tifn)
static __always_inline void __speculative_store_bypass_update(unsigned long tifn) { - if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) + if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) + amd_set_ssb_virt_state(tifn); + else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) amd_set_core_ssb_state(tifn); else intel_set_ssb_state(tifn);
From: Thomas Gleixner tglx@linutronix.de
commit 0270be3e34efb05a88bc4c422572ece038ef3608 upstream
The upcoming support for the virtual SPEC_CTRL MSR on AMD needs to reuse speculative_store_bypass_update() to avoid code duplication. Add an argument for supplying a thread info (TIF) value and create a wrapper speculative_store_bypass_update_current() which is used at the existing call site.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/spec-ctrl.h | 7 ++++++- arch/x86/kernel/cpu/bugs.c | 2 +- arch/x86/kernel/process.c | 4 ++-- 3 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 6e28740..82b6c5a 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -42,6 +42,11 @@ extern void speculative_store_bypass_ht_init(void); static inline void speculative_store_bypass_ht_init(void) { } #endif
-extern void speculative_store_bypass_update(void); +extern void speculative_store_bypass_update(unsigned long tif); + +static inline void speculative_store_bypass_update_current(void) +{ + speculative_store_bypass_update(current_thread_info()->flags); +}
#endif diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 50ab206a..1b29be9 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -596,7 +596,7 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) * mitigation until it is next scheduled. */ if (task == current && update) - speculative_store_bypass_update(); + speculative_store_bypass_update_current();
return 0; } diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index eab9d0c..e18c879 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -334,10 +334,10 @@ static __always_inline void __speculative_store_bypass_update(unsigned long tifn intel_set_ssb_state(tifn); }
-void speculative_store_bypass_update(void) +void speculative_store_bypass_update(unsigned long tif) { preempt_disable(); - __speculative_store_bypass_update(current_thread_info()->flags); + __speculative_store_bypass_update(tif); preempt_enable(); }
From: Borislav Petkov bp@suse.de
commit cc69b34989210f067b2c51d5539b5f96ebcc3a01 upstream
Function bodies are very similar and are going to grow more nearly identical code. Add a bool arg to determine whether SPEC_CTRL is being set for the guest or restored to the host.
No functional changes.
Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/spec-ctrl.h | 33 ++++++++++++++++++--- arch/x86/kernel/cpu/bugs.c | 60 ++++++++++---------------------------- 2 files changed, 44 insertions(+), 49 deletions(-)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 82b6c5a..9cecbe5 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -13,10 +13,35 @@ * Takes the guest view of SPEC_CTRL MSR as a parameter and also * the guest's version of VIRT_SPEC_CTRL, if emulated. */ -extern void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, - u64 guest_virt_spec_ctrl); -extern void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, - u64 guest_virt_spec_ctrl); +extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest); + +/** + * x86_spec_ctrl_set_guest - Set speculation control registers for the guest + * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL + * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL + * (may get translated to MSR_AMD64_LS_CFG bits) + * + * Avoids writing to the MSR if the content/bits are the same + */ +static inline +void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) +{ + x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true); +} + +/** + * x86_spec_ctrl_restore_host - Restore host speculation control registers + * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL + * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL + * (may get translated to MSR_AMD64_LS_CFG bits) + * + * Avoids writing to the MSR if the content/bits are the same + */ +static inline +void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) +{ + x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false); +}
/* AMD specific Speculative Store Bypass MSR data */ extern u64 x86_amd_ls_cfg_base; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 1b29be9..208d44c 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -149,55 +149,25 @@ u64 x86_spec_ctrl_get_default(void) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
-/** - * x86_spec_ctrl_set_guest - Set speculation control registers for the guest - * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL - * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL - * (may get translated to MSR_AMD64_LS_CFG bits) - * - * Avoids writing to the MSR if the content/bits are the same - */ -void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) +void +x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) { - u64 host = x86_spec_ctrl_base; + struct thread_info *ti = current_thread_info(); + u64 msr, host = x86_spec_ctrl_base;
/* Is MSR_SPEC_CTRL implemented ? */ - if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) - return; - - /* SSBD controlled in MSR_SPEC_CTRL */ - if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) - host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags); - - if (host != guest_spec_ctrl) - wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl); -} -EXPORT_SYMBOL_GPL(x86_spec_ctrl_set_guest); - -/** - * x86_spec_ctrl_restore_host - Restore host speculation control registers - * @guest_spec_ctrl: The guest content of MSR_SPEC_CTRL - * @guest_virt_spec_ctrl: The guest controlled bits of MSR_VIRT_SPEC_CTRL - * (may get translated to MSR_AMD64_LS_CFG bits) - * - * Avoids writing to the MSR if the content/bits are the same - */ -void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) -{ - u64 host = x86_spec_ctrl_base; - - /* Is MSR_SPEC_CTRL implemented ? */ - if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) - return; - - /* SSBD controlled in MSR_SPEC_CTRL */ - if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) - host |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags); - - if (host != guest_spec_ctrl) - wrmsrl(MSR_IA32_SPEC_CTRL, host); + if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) { + /* SSBD controlled in MSR_SPEC_CTRL */ + if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) + host |= ssbd_tif_to_spec_ctrl(ti->flags); + + if (host != guest_spec_ctrl) { + msr = setguest ? guest_spec_ctrl : host; + wrmsrl(MSR_IA32_SPEC_CTRL, msr); + } + } } -EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host); +EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
static void x86_amd_ssb_disable(void) {
From: Thomas Gleixner tglx@linutronix.de
commit fa8ac4988249c38476f6ad678a4848a736373403 upstream
x86_spec_ctrl_base is the system-wide default value for the SPEC_CTRL MSR. x86_spec_ctrl_get_default() returns x86_spec_ctrl_base and was intended to prevent modification of that variable. However, the variable is read-only after init and already globally visible.
Remove the function and export the variable instead.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 16 +++++----------- arch/x86/include/asm/spec-ctrl.h | 3 --- arch/x86/kernel/cpu/bugs.c | 11 +---------- 3 files changed, 6 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 640c11b..2757c79 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -172,16 +172,7 @@ enum spectre_v2_mitigation { SPECTRE_V2_IBRS, };
-/* - * The Intel specification for the SPEC_CTRL MSR requires that we - * preserve any already set reserved bits at boot time (e.g. for - * future additions that this kernel is not currently aware of). - * We then set any additional mitigation bits that we want - * ourselves and always use this as the base for SPEC_CTRL. - * We also use this when handling guest entry/exit as below. - */ extern void x86_spec_ctrl_set(u64); -extern u64 x86_spec_ctrl_get_default(void);
/* The Speculative Store Bypass disable variants */ enum ssb_mitigation { @@ -232,6 +223,9 @@ static inline void indirect_branch_prediction_barrier(void) alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB); }
+/* The Intel SPEC CTRL MSR base value cache */ +extern u64 x86_spec_ctrl_base; + /* * With retpoline, we must use IBRS to restrict branch prediction * before calling into firmware. @@ -240,7 +234,7 @@ static inline void indirect_branch_prediction_barrier(void) */ #define firmware_restrict_branch_speculation_start() \ do { \ - u64 val = x86_spec_ctrl_get_default() | SPEC_CTRL_IBRS; \ + u64 val = x86_spec_ctrl_base | SPEC_CTRL_IBRS; \ \ preempt_disable(); \ alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \ @@ -249,7 +243,7 @@ do { \
#define firmware_restrict_branch_speculation_end() \ do { \ - u64 val = x86_spec_ctrl_get_default(); \ + u64 val = x86_spec_ctrl_base; \ \ alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \ X86_FEATURE_USE_IBRS_FW); \ diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 9cecbe5..763d497 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -47,9 +47,6 @@ void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl) extern u64 x86_amd_ls_cfg_base; extern u64 x86_amd_ls_cfg_ssbd_mask;
-/* The Intel SPEC CTRL MSR base value cache */ -extern u64 x86_spec_ctrl_base; - static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn) { BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT); diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 208d44c..5391df5 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -35,6 +35,7 @@ static void __init ssb_select_mitigation(void); * writes to SPEC_CTRL contain whatever reserved bits have been set. */ u64 x86_spec_ctrl_base; +EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
/* * The vendor and possibly platform specific bits which can be modified in @@ -139,16 +140,6 @@ void x86_spec_ctrl_set(u64 val) } EXPORT_SYMBOL_GPL(x86_spec_ctrl_set);
-u64 x86_spec_ctrl_get_default(void) -{ - u64 msrval = x86_spec_ctrl_base; - - if (static_cpu_has(X86_FEATURE_SPEC_CTRL)) - msrval |= ssbd_tif_to_spec_ctrl(current_thread_info()->flags); - return msrval; -} -EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default); - void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) {
From: Thomas Gleixner tglx@linutronix.de
commit 4b59bdb569453a60b752b274ca61f009e37f4dae upstream
x86_spec_ctrl_set() is only used in bugs.c and the extra mask checks there provide no real value as both call sites can just write x86_spec_ctrl_base to MSR_SPEC_CTRL. x86_spec_ctrl_base is valid and does not need any extra masking or checking.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/nospec-branch.h | 2 -- arch/x86/kernel/cpu/bugs.c | 13 ++----------- 2 files changed, 2 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 2757c79..b4c74c2 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -172,8 +172,6 @@ enum spectre_v2_mitigation { SPECTRE_V2_IBRS, };
-extern void x86_spec_ctrl_set(u64); - /* The Speculative Store Bypass disable variants */ enum ssb_mitigation { SPEC_STORE_BYPASS_NONE, diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 5391df5..05eed68 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -131,15 +131,6 @@ static const char *spectre_v2_strings[] = {
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
-void x86_spec_ctrl_set(u64 val) -{ - if (val & x86_spec_ctrl_mask) - WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val); - else - wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val); -} -EXPORT_SYMBOL_GPL(x86_spec_ctrl_set); - void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) { @@ -501,7 +492,7 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void) case X86_VENDOR_INTEL: x86_spec_ctrl_base |= SPEC_CTRL_SSBD; x86_spec_ctrl_mask &= ~SPEC_CTRL_SSBD; - x86_spec_ctrl_set(SPEC_CTRL_SSBD); + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); break; case X86_VENDOR_AMD: x86_amd_ssb_disable(); @@ -613,7 +604,7 @@ int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) void x86_spec_ctrl_setup_ap(void) { if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) - x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask); + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
if (ssb_mode == SPEC_STORE_BYPASS_DISABLE) x86_amd_ssb_disable();
From: Thomas Gleixner tglx@linutronix.de
commit be6fcb5478e95bb1c91f489121238deb3abca46a upstream
x86_spec_ctrl_mask is intended to mask out bits from a MSR_SPEC_CTRL value which are not to be modified. However, the implementation is not really used, and the bitmask was inverted to make a check easier, which was removed in "x86/bugs: Remove x86_spec_ctrl_set()".
Aside from that, it is missing the STIBP bit if it is supported by the platform, so if the mask were used in x86_virt_spec_ctrl() then it would prevent a guest from setting STIBP.
Add the STIBP bit if supported and use the mask in x86_virt_spec_ctrl() to sanitize the value which is supplied by the guest.
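A quick worked example of that sanitizing (illustrative values; SPEC_CTRL_IBRS, SPEC_CTRL_STIBP and SPEC_CTRL_SSBD are bits 0, 1 and 2 of the MSR):

  u64 mask     = SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD; /* 0x7 */
  u64 hostval  = 0;                  /* host base value, nothing set */
  u64 guest    = 0xc0000005;         /* IBRS | SSBD plus two unknown bits */

  u64 guestval = hostval & ~mask;    /* keep host's non-modifiable bits */
  guestval    |= guest & mask;       /* take modifiable bits from the guest */
  /* guestval == 0x5 (IBRS | SSBD); the unknown high bits are dropped */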
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@suse.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/kernel/cpu/bugs.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 05eed68..af11a02 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -41,7 +41,7 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base); * The vendor and possibly platform specific bits which can be modified in * x86_spec_ctrl_base. */ -static u64 x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS; +static u64 x86_spec_ctrl_mask = SPEC_CTRL_IBRS;
/* * AMD specific MSR info for Speculative Store Bypass control. @@ -67,6 +67,10 @@ void __init check_bugs(void) if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+ /* Allow STIBP in MSR_SPEC_CTRL if supported */ + if (boot_cpu_has(X86_FEATURE_STIBP)) + x86_spec_ctrl_mask |= SPEC_CTRL_STIBP; + /* Select the proper spectre mitigation before patching alternatives */ spectre_v2_select_mitigation();
@@ -134,18 +138,26 @@ static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) { + u64 msrval, guestval, hostval = x86_spec_ctrl_base; struct thread_info *ti = current_thread_info(); - u64 msr, host = x86_spec_ctrl_base;
/* Is MSR_SPEC_CTRL implemented ? */ if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) { + /* + * Restrict guest_spec_ctrl to supported values. Clear the + * modifiable bits in the host base value and or the + * modifiable bits from the guest value. + */ + guestval = hostval & ~x86_spec_ctrl_mask; + guestval |= guest_spec_ctrl & x86_spec_ctrl_mask; + /* SSBD controlled in MSR_SPEC_CTRL */ if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) - host |= ssbd_tif_to_spec_ctrl(ti->flags); + hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
- if (host != guest_spec_ctrl) { - msr = setguest ? guest_spec_ctrl : host; - wrmsrl(MSR_IA32_SPEC_CTRL, msr); + if (hostval != guestval) { + msrval = setguest ? guestval : hostval; + wrmsrl(MSR_IA32_SPEC_CTRL, msrval); } } } @@ -491,7 +503,7 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void) switch (boot_cpu_data.x86_vendor) { case X86_VENDOR_INTEL: x86_spec_ctrl_base |= SPEC_CTRL_SSBD; - x86_spec_ctrl_mask &= ~SPEC_CTRL_SSBD; + x86_spec_ctrl_mask |= SPEC_CTRL_SSBD; wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); break; case X86_VENDOR_AMD:
From: Thomas Gleixner tglx@linutronix.de
commit 47c61b3955cf712cadfc25635bf9bc174af030ea upstream
Add the necessary logic for supporting the emulated VIRT_SPEC_CTRL MSR to x86_virt_spec_ctrl(). If either X86_FEATURE_LS_CFG_SSBD or X86_FEATURE_VIRT_SSBD is set, then use the new guest_virt_spec_ctrl argument to check whether the state must be modified on the host. The update reuses speculative_store_bypass_update() so the ZEN-specific sibling coordination can be reused.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/spec-ctrl.h | 6 ++++++ arch/x86/kernel/cpu/bugs.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+)
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h index 763d497..ae7c2c5 100644 --- a/arch/x86/include/asm/spec-ctrl.h +++ b/arch/x86/include/asm/spec-ctrl.h @@ -53,6 +53,12 @@ static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn) return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); }
+static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl) +{ + BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT); + return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); +} + static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn) { return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index af11a02..12a8867 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -160,6 +160,36 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) wrmsrl(MSR_IA32_SPEC_CTRL, msrval); } } + + /* + * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update + * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported. + */ + if (!static_cpu_has(X86_FEATURE_LS_CFG_SSBD) && + !static_cpu_has(X86_FEATURE_VIRT_SSBD)) + return; + + /* + * If the host has SSBD mitigation enabled, force it in the host's + * virtual MSR value. If its not permanently enabled, evaluate + * current's TIF_SSBD thread flag. + */ + if (static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE)) + hostval = SPEC_CTRL_SSBD; + else + hostval = ssbd_tif_to_spec_ctrl(ti->flags); + + /* Sanitize the guest value */ + guestval = guest_virt_spec_ctrl & SPEC_CTRL_SSBD; + + if (hostval != guestval) { + unsigned long tif; + + tif = setguest ? ssbd_spec_ctrl_to_tif(guestval) : + ssbd_spec_ctrl_to_tif(hostval); + + speculative_store_bypass_update(tif); + } } EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
From: Konrad Rzeszutek Wilk konrad.wilk@oracle.com
commit 240da953fcc6a9008c92fae5b1f727ee5ed167ab upstream
The "336996 Speculative Execution Side Channel Mitigations" from May defines this as SSB_NO, hence lets sync-up.
Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Srivatsa S. Bhat srivatsa@csail.mit.edu Reviewed-by: Matt Helsley (VMware) matt.helsley@gmail.com Reviewed-by: Alexey Makhalov amakhalov@vmware.com Reviewed-by: Bo Gan ganb@vmware.com ---
arch/x86/include/asm/msr-index.h | 2 +- arch/x86/kernel/cpu/common.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 22f2dd5..caa0019 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -58,7 +58,7 @@ #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a #define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */ #define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */ -#define ARCH_CAP_SSBD_NO (1 << 4) /* +#define ARCH_CAP_SSB_NO (1 << 4) /* * Not susceptible to Speculative Store Bypass * attack, so no Speculative Store Bypass * control required. diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 1097723..9ad38ad 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -881,7 +881,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
if (!x86_match_cpu(cpu_no_spec_store_bypass) && - !(ia32_cap & ARCH_CAP_SSBD_NO)) + !(ia32_cap & ARCH_CAP_SSB_NO)) setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (x86_match_cpu(cpu_no_speculation))
Hi Srivatsa,
just a small personal comment below since I'm not the 4.4 maintainer :-)
On Fri, Jul 13, 2018 at 02:49:14PM -0700, Srivatsa S. Bhat wrote:
> You'll notice that the initial few patches in this series include cleanups etc., that are non-critical to IBPB/IBRS/SSBD. Most of these patches are aimed at getting the cpufeature.h vs cpufeatures.h split into 4.4, since a lot of the subsequent patches update these headers. On my first attempt to backport these patches to 4.4.y, I had actually tried to do all the updates on the cpufeature.h file itself, but it started getting very cumbersome, so I resorted to backporting the cpufeature.h vs cpufeatures.h split and their dependencies as well.
When I used to maintain 2.6.32, initially I tried to backport the minimal amount of changes. It quickly resulted in certain files (mostly includes) diverging a lot from more recent versions, making it a real pain to later backport other fixes. Worse, sometimes I would not even be 100% confident in my backports (e.g. macros being declared very differently or in multiple places, which I could easily miss). For 3.10, I stopped doing that and instead preferred to pick a bit more "noise" to prepare and adapt the files to be patched so that overall the code looked more similar to more recent versions. I found that not only did it help with backports, but it also reduced the amount of breakage for later backports, and it helped with reviews, since what matters is not that the old kernel is correct but that the recent one is, and if the code is the same then the old one is correct.
Thus, based on my experience with these two different methods, I guess you took the best approach here.
> I would appreciate if you could kindly consider these patches for review and inclusion in a future 4.4.y release.
However here I notice that your patches were not CCed to their respective authors nor to the list of participants; I predict that Greg will ask you for this, as they're almost the only ones understanding their own tricks ;-)
Cheers, Willy
On Sat, Jul 14, 2018 at 07:27:55AM +0200, Willy Tarreau wrote:
> Hi Srivatsa,

> just a small personal comment below since I'm not the 4.4 maintainer :-)

> On Fri, Jul 13, 2018 at 02:49:14PM -0700, Srivatsa S. Bhat wrote:

>> You'll notice that the initial few patches in this series include cleanups etc., that are non-critical to IBPB/IBRS/SSBD. Most of these patches are aimed at getting the cpufeature.h vs cpufeatures.h split into 4.4, since a lot of the subsequent patches update these headers. On my first attempt to backport these patches to 4.4.y, I had actually tried to do all the updates on the cpufeature.h file itself, but it started getting very cumbersome, so I resorted to backporting the cpufeature.h vs cpufeatures.h split and their dependencies as well.

> When I used to maintain 2.6.32, initially I tried to backport the minimal amount of changes. It quickly resulted in certain files (mostly includes) diverging a lot from more recent versions, making it a real pain to later backport other fixes. Worse, sometimes I would not even be 100% confident in my backports (e.g. macros being declared very differently or in multiple places, which I could easily miss). For 3.10, I stopped doing that and instead preferred to pick a bit more "noise" to prepare and adapt the files to be patched so that overall the code looked more similar to more recent versions. I found that not only did it help with backports, but it also reduced the amount of breakage for later backports, and it helped with reviews, since what matters is not that the old kernel is correct but that the recent one is, and if the code is the same then the old one is correct.

> Thus, based on my experience with these two different methods, I guess you took the best approach here.
Yes, taking the "noise" has been found to be better over time. But, for these patches, I haven't taken that change to 4.9 just yet, and it is causing a headache for people. Perhaps we should do it there too? I don't want 4.9 to be the odd-ball kernel where we didn't do it; that will just cause more problems over time.
>> I would appreciate if you could kindly consider these patches for review and inclusion in a future 4.4.y release.

> However here I notice that your patches were not CCed to their respective authors nor to the list of participants; I predict that Greg will ask you for this, as they're almost the only ones understanding their own tricks ;-)
Ah, you beat me to it :)
Yes, you should resend and cc: all of the authors and reviewers; I want their feedback if at all possible.
Normally it's hard not to do that when using git send-email, so you had to do extra work to prevent this from happening...
thanks,
greg k-h
On Sat, Jul 14, 2018 at 07:48:37AM +0200, Greg KH wrote:
> But, for these patches, I haven't taken that change to 4.9 just yet, and it is causing a headache for people. Perhaps we should do it there too? I don't want 4.9 to be the odd-ball kernel where we didn't do it; that will just cause more problems over time.
Definitely. I'm still heavily relying on 4.9, and even if I don't enable these features, I'm willing to provide some help on this as time permits, so that at the very least we can reduce the risk of breakage of 4.9 over time. So do not hesitate to ping me if you want me to test some patchsets or help with backporting certain series.
Willy
On 7/13/18 10:48 PM, Greg KH wrote:
> On Sat, Jul 14, 2018 at 07:27:55AM +0200, Willy Tarreau wrote:

>> Hi Srivatsa,

>> just a small personal comment below since I'm not the 4.4 maintainer :-)

>> On Fri, Jul 13, 2018 at 02:49:14PM -0700, Srivatsa S. Bhat wrote:

>>> You'll notice that the initial few patches in this series include cleanups etc., that are non-critical to IBPB/IBRS/SSBD. Most of these patches are aimed at getting the cpufeature.h vs cpufeatures.h split into 4.4, since a lot of the subsequent patches update these headers. On my first attempt to backport these patches to 4.4.y, I had actually tried to do all the updates on the cpufeature.h file itself, but it started getting very cumbersome, so I resorted to backporting the cpufeature.h vs cpufeatures.h split and their dependencies as well.

>> When I used to maintain 2.6.32, initially I tried to backport the minimal amount of changes. It quickly resulted in certain files (mostly includes) diverging a lot from more recent versions, making it a real pain to later backport other fixes. Worse, sometimes I would not even be 100% confident in my backports (e.g. macros being declared very differently or in multiple places, which I could easily miss). For 3.10, I stopped doing that and instead preferred to pick a bit more "noise" to prepare and adapt the files to be patched so that overall the code looked more similar to more recent versions. I found that not only did it help with backports, but it also reduced the amount of breakage for later backports, and it helped with reviews, since what matters is not that the old kernel is correct but that the recent one is, and if the code is the same then the old one is correct.

>> Thus, based on my experience with these two different methods, I guess you took the best approach here.
Thank you very much Willy for your feedback and for confirming that backporting a bit more than the minimal changes (when it helps) is the better approach in the long run!
> Yes, taking the "noise" has been found to be better over time. But, for these patches, I haven't taken that change to 4.9 just yet, and it is causing a headache for people. Perhaps we should do it there too? I don't want 4.9 to be the odd-ball kernel where we didn't do it; that will just cause more problems over time.

>>> I would appreciate if you could kindly consider these patches for review and inclusion in a future 4.4.y release.

>> However here I notice that your patches were not CCed to their respective authors nor to the list of participants; I predict that Greg will ask you for this, as they're almost the only ones understanding their own tricks ;-)

> Ah, you beat me to it :)

> Yes, you should resend and cc: all of the authors and reviewers; I want their feedback if at all possible.
Sure, will do!
> Normally it's hard not to do that when using git send-email, so you had to do extra work to prevent this from happening...
Well, I used stgit to send these patches, which requires that I specify '--auto' (which I had skipped earlier) in order to CC the patches to the patch signers.
Regards, Srivatsa VMware Photon OS