This version fixes bugs that caused build problems for the UP configuration.
It also merges in Jiri's change to extend STIBP to SECCOMP processes and renames TIF_STIBP to TIF_SPEC_INDIR_BRANCH.
I've updated the spectre_v2_app2app boot option to accept on, off, auto, prctl and seccomp. This aligns with the options for other speculation related mitigations.
I tried to incorporate sched_smt_present to detect when all SMT siblings have gone offline so that the SMT path can be disabled, as Peter suggested. This optimization can easily be left out of the patch series and not backported. I've put these two patches at the end so they can be considered separately.
I've dropped the TIF flags re-organization patches as they are not needed in this patch series.
To do: Create a dedicated document on the mitigation options for Spectre V2.
Since Jiri's patchset to always turn on STIBP has a big performance impact, I think it should be reverted from 4.20 and the stable kernels for now, until this patchset to mitigate its performance impact can be merged with it into mainline and backported to the stable kernels.
Thanks.
Tim
Patch 1 to 3 are clean up patches.
Patch 4 and 5 disable STIBP for enhanced IBRS.
Patch 6 to 9 reorganize and clean up the code, without affecting functionality, for easier modification later.
Patch 10 introduces the STIBP flag on a task to dynamically enable STIBP for that task.
Patch 11 introduces different modes to protect a task against Spectre v2 user space attacks.
Patch 12 adds a prctl interface to turn on Spectre v2 user mode defenses on a task.
Patch 13 puts IBPB usage under the mode chosen for app2app mitigation.
Patch 14 adds STIBP protection for SECCOMP tasks.
Patch 15-16 add Spectre v2 defenses for non-dumpable tasks.
Patch 17-18 disable SMT specific code when there is no paired SMT left.
Changes:
v6:
1. Fix bugs for UP build configuration.
2. Add protection for SECCOMP tasks.
3. Rename TIF_STIBP to TIF_SPEC_INDIR_BRANCH.
4. Update boot options to align with other speculation mitigations.
5. Separate out the IBPB change that makes it depend on TIF_SPEC_INDIR_BRANCH.
6. Move some checks for SPEC_CTRL updates to spec_ctrl_update_msr to avoid unnecessary MSR writes.
7. Drop TIF reorg patches.
8. Incorporate optimization to disable SMT code paths when no paired SMT is present.
v5:
1. Drop patch to extend TIF_STIBP changes to all related threads on a task's dumpability change.
2. Drop patch to replace sched_smt_present with cpu_smt_enabled.
3. Drop export of cpu_smt_control in kernel/cpu.c and replace external usages of cpu_smt_control with cpu_smt_enabled.
4. Rebase patch series on 4.20-rc2.
v4:
1. Extend STIBP update to all threads of a process changing its dumpability.
2. Add logic to update the SPEC_CTRL MSR on a remote CPU when TIF flags affecting speculation change for a task running on that CPU.
3. Regroup x86 TIF_* flags according to their functions.
4. Various code clean up.
v3:
1. Add logic to skip STIBP when Enhanced IBRS is used.
2. Break up v2 patches into smaller logical patches.
3. Fix bug in arch_set_dumpable that did not update the SPEC_CTRL MSR right away when a task's STIBP flag was cleared, which caused STIBP to be left on.
4. Various code clean up.
v2:
1. Extend per process STIBP to AMD CPUs.
2. Add prctl option to control per process indirect branch speculation.
3. Bug fixes and cleanups.
Jiri's patchset to harden the Spectre v2 user space mitigation puts IBPB and STIBP in use on all processes. IBPB is issued when switching to an application that is not ptraceable by the previous application, and STIBP is left on at all times.
However, leaving STIBP on all the time is expensive for certain applications that have frequent indirect branches. One such application is perlbench in the SpecInt Rate 2006 test suite, which shows a 21% reduction in throughput. There are also reports of performance drops on Python and PHP benchmarks: https://www.phoronix.com/scan.php?page=article&item=linux-420-bisect&...
Other applications like bzip2, with minimal indirect branches, see only a 0.7% reduction in throughput. IBPB will also impose overhead during context switches.
Users may not wish to incur the performance overhead of IBPB and STIBP on general, non-security-sensitive processes, and may want to use these mitigations only for security-sensitive ones.
This patchset provides a process-property-based lite protection mode. In this mode, the IBPB and STIBP mitigations are applied only to security-sensitive non-dumpable processes and to processes that users want to protect by having indirect branch speculation disabled via prctl. So the overhead from IBPB and STIBP is avoided for low-security processes that don't require extra protection.
Jiri Kosina (1):
  x86/speculation: Add 'seccomp' Spectre v2 app to app protection mode

Peter Zijlstra (1):
  sched/smt: Make sched_smt_present track topology

Tim Chen (14):
  x86/speculation: Reorganize cpu_show_common()
  x86/speculation: Add X86_FEATURE_USE_IBRS_ENHANCED
  x86/speculation: Disable STIBP when enhanced IBRS is in use
  x86/speculation: Rename SSBD update functions
  x86/speculation: Reorganize speculation control MSRs update
  smt: Create cpu_smt_enabled static key for SMT specific code
  x86/smt: Convert cpu_smt_control check to cpu_smt_enabled static key
  x86/speculation: Turn on or off STIBP according to a task's TIF_STIBP
  x86/speculation: Add Spectre v2 app to app protection modes
  x86/speculation: Create PRCTL interface to restrict indirect branch speculation
  x86/speculation: Enable IBPB for tasks with TIF_SPEC_BRANCH_SPECULATION
  security: Update speculation restriction of a process when modifying its dumpability
  x86/speculation: Use STIBP to restrict speculation on non-dumpable task
  x86/smt: Allow disabling of SMT when last SMT is offlined
 Documentation/admin-guide/kernel-parameters.txt |  34 +++
 Documentation/userspace-api/spec_ctrl.rst       |   9 +
 arch/x86/include/asm/cpufeatures.h              |   1 +
 arch/x86/include/asm/msr-index.h                |   6 +-
 arch/x86/include/asm/nospec-branch.h            |  10 +
 arch/x86/include/asm/spec-ctrl.h                |  18 +-
 arch/x86/include/asm/thread_info.h              |   5 +-
 arch/x86/kernel/cpu/bugs.c                      | 304 ++++++++++++++++++++++--
 arch/x86/kernel/process.c                       |  58 ++++-
 arch/x86/kvm/vmx.c                              |   2 +-
 arch/x86/mm/tlb.c                               |  23 +-
 fs/exec.c                                       |   3 +
 include/linux/cpu.h                             |  31 ++-
 include/linux/sched.h                           |   9 +
 include/uapi/linux/prctl.h                      |   1 +
 kernel/cpu.c                                    |  28 ++-
 kernel/cred.c                                   |   5 +-
 kernel/sched/core.c                             |  19 +-
 kernel/sched/sched.h                            |   2 -
 kernel/sys.c                                    |   7 +
 tools/include/uapi/linux/prctl.h                |   1 +
 21 files changed, 512 insertions(+), 64 deletions(-)
The Spectre V2 printout in cpu_show_common() handles conditionals for the various mitigation methods directly in the sprintf() argument list. That's hard to read and will become unreadable if more complex decisions need to be made for a particular method.
Move the conditionals for STIBP and IBPB string selection into helper functions, so they can be extended later on.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 84e3579..91a754a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -853,6 +853,22 @@ static ssize_t l1tf_show_state(char *buf)
 }
 #endif

+static char *stibp_state(void)
+{
+	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
+		return ", STIBP";
+	else
+		return "";
+}
+
+static char *ibpb_state(void)
+{
+	if (boot_cpu_has(X86_FEATURE_USE_IBPB))
+		return ", IBPB";
+	else
+		return "";
+}
+
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
			       char *buf, unsigned int bug)
 {
@@ -874,9 +890,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 	case X86_BUG_SPECTRE_V2:
 		return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
-			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+			       ibpb_state(),
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
-			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+			       stibp_state(),
 			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
STIBP is not needed when enhanced IBRS is used for Spectre V2 mitigation. A CPU feature flag to indicate that enhanced IBRS is used will be handy for skipping STIBP for this case.
Add X86_FEATURE_USE_IBRS_ENHANCED feature bit to indicate enhanced IBRS is used for Spectre V2 mitigation.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/bugs.c         | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 28c4a50..fe8e064 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -221,6 +221,7 @@
 #define X86_FEATURE_ZEN			( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
 #define X86_FEATURE_L1TF_PTEINV		( 7*32+29) /* "" L1TF workaround PTE inversion */
 #define X86_FEATURE_IBRS_ENHANCED	( 7*32+30) /* Enhanced IBRS */
+#define X86_FEATURE_USE_IBRS_ENHANCED	( 7*32+31) /* "" Enhanced IBRS enabled */

 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 91a754a..3a6f13b 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -387,6 +387,7 @@ static void __init spectre_v2_select_mitigation(void)
 			/* Force it so VMEXIT will restore correctly */
 			x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
 			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+			setup_force_cpu_cap(X86_FEATURE_USE_IBRS_ENHANCED);
 			goto specv2_set_mode;
 		}
 		if (IS_ENABLED(CONFIG_RETPOLINE))
If enhanced IBRS is engaged, STIBP is redundant for mitigating Spectre v2 user space exploits from a hyperthread sibling.
Disable STIBP when enhanced IBRS is used.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 3a6f13b..199f27e 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -323,11 +323,16 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 	return cmd;
 }

+/* Determine if STIBP should be always on. */
 static bool stibp_needed(void)
 {
 	if (spectre_v2_enabled == SPECTRE_V2_NONE)
 		return false;

+	/* Using enhanced IBRS makes using STIBP unnecessary. */
+	if (static_cpu_has(X86_FEATURE_USE_IBRS_ENHANCED))
+		return false;
+
 	if (!boot_cpu_has(X86_FEATURE_STIBP))
 		return false;

@@ -856,6 +861,9 @@ static ssize_t l1tf_show_state(char *buf)

 static char *stibp_state(void)
 {
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+		return "";
+
 	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
 		return ", STIBP";
 	else
During context switch, the SSBD bit in the SPEC_CTRL MSR is updated according to the change in the TIF_SSBD flag between the current and the next running task. Currently, only the bit controlling speculative store bypass in the SPEC_CTRL MSR is updated, and the related update functions all have "speculative_store" or "ssb" in their names.

In later patches, the bit controlling STIBP in the SPEC_CTRL MSR will also need to be updated. The SPEC_CTRL MSR update functions should shed the speculative store names, as they will no longer be limited to SSBD updates.
Rename the "speculative_store*" functions to a more generic name.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/include/asm/spec-ctrl.h |  6 +++---
 arch/x86/kernel/cpu/bugs.c       |  4 ++--
 arch/x86/kernel/process.c        | 12 ++++++------
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index ae7c2c5..8e2f841 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -70,11 +70,11 @@ extern void speculative_store_bypass_ht_init(void);
 static inline void speculative_store_bypass_ht_init(void) { }
 #endif

-extern void speculative_store_bypass_update(unsigned long tif);
+extern void speculation_ctrl_update(unsigned long tif);

-static inline void speculative_store_bypass_update_current(void)
+static inline void speculation_ctrl_update_current(void)
 {
-	speculative_store_bypass_update(current_thread_info()->flags);
+	speculation_ctrl_update(current_thread_info()->flags);
 }

 #endif
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 199f27e..a63456a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -202,7 +202,7 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 		tif = setguest ? ssbd_spec_ctrl_to_tif(guestval) :
 				 ssbd_spec_ctrl_to_tif(hostval);

-		speculative_store_bypass_update(tif);
+		speculation_ctrl_update(tif);
 	}
 }
 EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl);
@@ -643,7 +643,7 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
 	 * mitigation until it is next scheduled.
 	 */
 	if (task == current && update)
-		speculative_store_bypass_update_current();
+		speculation_ctrl_update_current();

 	return 0;
 }
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c93fcfd..8aa4960 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -395,27 +395,27 @@ static __always_inline void amd_set_ssb_virt_state(unsigned long tifn)
 	wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn));
 }

-static __always_inline void intel_set_ssb_state(unsigned long tifn)
+static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
 {
 	u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);

 	wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }

-static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
+static __always_inline void __speculation_ctrl_update(unsigned long tifn)
 {
 	if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
 		amd_set_ssb_virt_state(tifn);
 	else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
 		amd_set_core_ssb_state(tifn);
 	else
-		intel_set_ssb_state(tifn);
+		spec_ctrl_update_msr(tifn);
 }

-void speculative_store_bypass_update(unsigned long tif)
+void speculation_ctrl_update(unsigned long tif)
 {
 	preempt_disable();
-	__speculative_store_bypass_update(tif);
+	__speculation_ctrl_update(tif);
 	preempt_enable();
 }

@@ -452,7 +452,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));

 	if ((tifp ^ tifn) & _TIF_SSBD)
-		__speculative_store_bypass_update(tifn);
+		__speculation_ctrl_update(tifn);
 }

 /*
The logic to detect whether there is a change between the previous and next task's flags relevant to updating the speculation control MSRs is spread out across multiple functions.
Consolidate all checks needed for updating speculation control MSRs to __speculation_ctrl_update().
This makes it easy to pick the right speculation control MSR and the bits in the MSR that need updating based on TIF flag changes.
Originally-by: Thomas Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/process.c | 44 +++++++++++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8aa4960..74bef48 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -397,25 +397,48 @@ static __always_inline void amd_set_ssb_virt_state(unsigned long tifn)

 static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
 {
-	u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn);
+	u64 msr = x86_spec_ctrl_base;
+
+	/*
+	 * If X86_FEATURE_SSBD is not set, the SSBD
+	 * bit is not to be touched.
+	 */
+	if (static_cpu_has(X86_FEATURE_SSBD))
+		msr |= ssbd_tif_to_spec_ctrl(tifn);

 	wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }

-static __always_inline void __speculation_ctrl_update(unsigned long tifn)
-{
-	if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
-		amd_set_ssb_virt_state(tifn);
-	else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
-		amd_set_core_ssb_state(tifn);
-	else
+/*
+ * Update the MSRs managing speculation control during context switch.
+ *
+ * tifp: previous task's thread flags
+ * tifn: next task's thread flags
+ */
+static __always_inline void __speculation_ctrl_update(unsigned long tifp,
+						      unsigned long tifn)
+{
+	bool updmsr = false;
+
+	/* If TIF_SSBD is different, select the proper mitigation method */
+	if ((tifp ^ tifn) & _TIF_SSBD) {
+		if (static_cpu_has(X86_FEATURE_VIRT_SSBD))
+			amd_set_ssb_virt_state(tifn);
+		else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD))
+			amd_set_core_ssb_state(tifn);
+		else if (static_cpu_has(X86_FEATURE_SSBD))
+			updmsr = true;
+	}
+
+	if (updmsr)
 		spec_ctrl_update_msr(tifn);
 }

 void speculation_ctrl_update(unsigned long tif)
 {
+	/* Forced update. Make sure all relevant TIF flags are different */
 	preempt_disable();
-	__speculation_ctrl_update(tif);
+	__speculation_ctrl_update(~tif, tif);
 	preempt_enable();
 }

@@ -451,8 +474,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
 	if ((tifp ^ tifn) & _TIF_NOCPUID)
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));

-	if ((tifp ^ tifn) & _TIF_SSBD)
-		__speculation_ctrl_update(tifn);
+	__speculation_ctrl_update(tifp, tifn);
 }

 /*
In later code, STIBP will be turned on/off in the context switch code path when SMT is enabled. Checks for SMT are best avoided on such hot paths.
Create cpu_smt_enabled static key to turn on such SMT specific code statically.
The key is enabled by default, and its scope has nothing to do with CPU hotplug. It depends on CONFIG_HOTPLUG_SMT, the extra SMT control code that is currently only enabled on x86.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 include/linux/cpu.h |  4 ++++
 kernel/cpu.c        | 12 ++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 218df7f..ce8267e 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -189,4 +189,8 @@ static inline void cpu_smt_check_topology_early(void) { }
 static inline void cpu_smt_check_topology(void) { }
 #endif

+#ifdef CONFIG_HOTPLUG_SMT
+DECLARE_STATIC_KEY_TRUE(cpu_smt_enabled);
+#endif
+
 #endif /* _LINUX_CPU_H_ */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 3c7f3b4..e216154 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -370,6 +370,8 @@ static void lockdep_release_cpus_lock(void)
 #ifdef CONFIG_HOTPLUG_SMT
 enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
 EXPORT_SYMBOL_GPL(cpu_smt_control);
+DEFINE_STATIC_KEY_TRUE(cpu_smt_enabled);
+EXPORT_SYMBOL_GPL(cpu_smt_enabled);

 static bool cpu_smt_available __read_mostly;

@@ -386,6 +388,7 @@ void __init cpu_smt_disable(bool force)
 		pr_info("SMT: disabled\n");
 		cpu_smt_control = CPU_SMT_DISABLED;
 	}
+	static_branch_disable(&cpu_smt_enabled);
 }

 /*
@@ -395,8 +398,10 @@ void __init cpu_smt_disable(bool force)
  */
 void __init cpu_smt_check_topology_early(void)
 {
-	if (!topology_smt_supported())
+	if (!topology_smt_supported()) {
 		cpu_smt_control = CPU_SMT_NOT_SUPPORTED;
+		static_branch_disable(&cpu_smt_enabled);
+	}
 }

 /*
@@ -408,8 +413,10 @@ void __init cpu_smt_check_topology_early(void)
  */
 void __init cpu_smt_check_topology(void)
 {
-	if (!cpu_smt_available)
+	if (!cpu_smt_available) {
 		cpu_smt_control = CPU_SMT_NOT_SUPPORTED;
+		static_branch_disable(&cpu_smt_enabled);
+	}
 }

 static int __init smt_cmdline_disable(char *str)
@@ -2101,6 +2108,7 @@ static int cpuhp_smt_enable(void)

 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
+	static_branch_enable(&cpu_smt_enabled);
 	arch_smt_update();
 	for_each_present_cpu(cpu) {
 		/* Skip online CPUs and CPUs on offline nodes */
The checks of cpu_smt_control outside of kernel/cpu.c can be converted to use the cpu_smt_enabled key to gate SMT specific code.

Remove the export of cpu_smt_control and convert the usages of cpu_smt_control outside of kernel/cpu.c to cpu_smt_enabled.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 13 +++++++------
 arch/x86/kvm/vmx.c         |  2 +-
 include/linux/cpu.h        | 12 +++---------
 kernel/cpu.c               | 11 +++++++++--
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a63456a..3e5ae2c 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -353,15 +353,16 @@ void arch_smt_update(void)

 	mutex_lock(&spec_ctrl_mutex);
 	mask = x86_spec_ctrl_base;
-	if (cpu_smt_control == CPU_SMT_ENABLED)
+	if (cpu_use_smt_and_hotplug)
 		mask |= SPEC_CTRL_STIBP;
 	else
 		mask &= ~SPEC_CTRL_STIBP;

 	if (mask != x86_spec_ctrl_base) {
-		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
-			cpu_smt_control == CPU_SMT_ENABLED ?
-			"Enabling" : "Disabling");
+		if (cpu_use_smt_and_hotplug)
+			pr_info("Spectre v2 cross-process SMT mitigation: Enabling STIBP\n");
+		else
+			pr_info("Spectre v2 cross-process SMT mitigation: Disabling STIBP\n");
 		x86_spec_ctrl_base = mask;
 		on_each_cpu(update_stibp_msr, NULL, 1);
 	}
@@ -844,13 +845,13 @@ static ssize_t l1tf_show_state(char *buf)

 	if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED ||
 	    (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER &&
-	     cpu_smt_control == CPU_SMT_ENABLED))
+	     cpu_use_smt_and_hotplug))
 		return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG,
 			       l1tf_vmx_states[l1tf_vmx_mitigation]);

 	return sprintf(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG,
 		       l1tf_vmx_states[l1tf_vmx_mitigation],
-		       cpu_smt_control == CPU_SMT_ENABLED ? "vulnerable" : "disabled");
+		       cpu_use_smt_and_hotplug ? "vulnerable" : "disabled");
 }
 #else
 static ssize_t l1tf_show_state(char *buf)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4555077..6c71d4c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11607,7 +11607,7 @@ static int vmx_vm_init(struct kvm *kvm)
 		 * Warn upon starting the first VM in a potentially
 		 * insecure environment.
 		 */
-		if (cpu_smt_control == CPU_SMT_ENABLED)
+		if (cpu_use_smt_and_hotplug)
 			pr_warn_once(L1TF_MSG_SMT);
 		if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER)
 			pr_warn_once(L1TF_MSG_L1D);
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index ce8267e..6f43024 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -170,20 +170,14 @@ void cpuhp_report_idle_dead(void);
 static inline void cpuhp_report_idle_dead(void) { }
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */

-enum cpuhp_smt_control {
-	CPU_SMT_ENABLED,
-	CPU_SMT_DISABLED,
-	CPU_SMT_FORCE_DISABLED,
-	CPU_SMT_NOT_SUPPORTED,
-};
-
 #if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT)
-extern enum cpuhp_smt_control cpu_smt_control;
+DECLARE_STATIC_KEY_TRUE(cpu_smt_enabled);
+#define cpu_use_smt_and_hotplug	(static_branch_likely(&cpu_smt_enabled))
 extern void cpu_smt_disable(bool force);
 extern void cpu_smt_check_topology_early(void);
 extern void cpu_smt_check_topology(void);
 #else
-# define cpu_smt_control	(CPU_SMT_ENABLED)
+#define cpu_use_smt_and_hotplug	false
 static inline void cpu_smt_disable(bool force) { }
 static inline void cpu_smt_check_topology_early(void) { }
 static inline void cpu_smt_check_topology(void) { }
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e216154..f846416 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -368,8 +368,15 @@ static void lockdep_release_cpus_lock(void)
 #endif /* CONFIG_HOTPLUG_CPU */

 #ifdef CONFIG_HOTPLUG_SMT
-enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
-EXPORT_SYMBOL_GPL(cpu_smt_control);
+
+enum cpuhp_smt_control {
+	CPU_SMT_ENABLED,
+	CPU_SMT_DISABLED,
+	CPU_SMT_FORCE_DISABLED,
+	CPU_SMT_NOT_SUPPORTED,
+};
+
+static enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
 DEFINE_STATIC_KEY_TRUE(cpu_smt_enabled);
 EXPORT_SYMBOL_GPL(cpu_smt_enabled);
If STIBP is used all the time, tasks that do not need STIBP protection will get unnecessarily slowed down by STIBP.
To apply STIBP only to tasks that need it, a new task flag TIF_SPEC_INDIR_BRANCH is created. An x86 CPU uses STIBP only for tasks flagged with TIF_SPEC_INDIR_BRANCH.

During context switch, this flag is checked and the STIBP bit in the SPEC_CTRL MSR is updated according to the change in this flag between the previous and next task.
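As an aside, the TIF-to-MSR translation relied on here is a plain bit shift: TIF_SPEC_INDIR_BRANCH (bit 9) sits above the STIBP bit (bit 1) in SPEC_CTRL, so shifting right by the difference moves the flag into place. A minimal user-space sketch of that identity, using the constants from the diff below (illustrative only, not part of the patch):

	#include <assert.h>

	#define SPEC_CTRL_STIBP_SHIFT	1
	#define SPEC_CTRL_STIBP		(1UL << SPEC_CTRL_STIBP_SHIFT)
	#define TIF_SPEC_INDIR_BRANCH	9
	#define _TIF_SPEC_INDIR_BRANCH	(1UL << TIF_SPEC_INDIR_BRANCH)

	/* Mirrors stibp_tif_to_spec_ctrl() in the patch: one right shift
	 * moves the task's TIF bit into the STIBP position of the
	 * SPEC_CTRL value. */
	static unsigned long stibp_tif_to_spec_ctrl(unsigned long tifn)
	{
		return (tifn & _TIF_SPEC_INDIR_BRANCH) >>
		       (TIF_SPEC_INDIR_BRANCH - SPEC_CTRL_STIBP_SHIFT);
	}

	int main(void)
	{
		/* Flag set -> STIBP bit set in the MSR image. */
		assert(stibp_tif_to_spec_ctrl(_TIF_SPEC_INDIR_BRANCH) == SPEC_CTRL_STIBP);
		/* Flag clear -> STIBP bit clear. */
		assert(stibp_tif_to_spec_ctrl(0) == 0);
		return 0;
	}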
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/include/asm/msr-index.h   |  6 +++++-
 arch/x86/include/asm/spec-ctrl.h   | 12 ++++++++++++
 arch/x86/include/asm/thread_info.h |  5 ++++-
 arch/x86/kernel/process.c          | 14 +++++++++++++-
 4 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 80f4a4f..501c9d3 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -41,7 +41,11 @@
 #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
 #define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
-#define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
+#define SPEC_CTRL_STIBP_SHIFT		1	   /*
+						    * Single Thread Indirect Branch
+						    * Predictor (STIBP) bit
+						    */
+#define SPEC_CTRL_STIBP			(1 << SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
 #define SPEC_CTRL_SSBD			(1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 8e2f841..41b993e 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -53,12 +53,24 @@ static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn)
 	return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
 }

+static inline u64 stibp_tif_to_spec_ctrl(u64 tifn)
+{
+	BUILD_BUG_ON(TIF_SPEC_INDIR_BRANCH < SPEC_CTRL_STIBP_SHIFT);
+	return (tifn & _TIF_SPEC_INDIR_BRANCH) >> (TIF_SPEC_INDIR_BRANCH - SPEC_CTRL_STIBP_SHIFT);
+}
+
 static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl)
 {
 	BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT);
 	return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT);
 }

+static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl)
+{
+	BUILD_BUG_ON(TIF_SPEC_INDIR_BRANCH < SPEC_CTRL_STIBP_SHIFT);
+	return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_INDIR_BRANCH - SPEC_CTRL_STIBP_SHIFT);
+}
+
 static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
 {
 	return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 2ff2a30..b3032c7 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -83,6 +83,7 @@ struct thread_info {
 #define TIF_SYSCALL_EMU		6	/* syscall emulation active */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SECCOMP		8	/* secure computing */
+#define TIF_SPEC_INDIR_BRANCH	9	/* Indirect branch speculation restricted */
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_PATCH_PENDING	13	/* pending live patching update */
@@ -110,6 +111,7 @@ struct thread_info {
 #define _TIF_SYSCALL_EMU	(1 << TIF_SYSCALL_EMU)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
+#define _TIF_SPEC_INDIR_BRANCH	(1 << TIF_SPEC_INDIR_BRANCH)
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
@@ -146,7 +148,8 @@ struct thread_info {
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW							\
-	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD)
+	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|	\
+	 _TIF_SSBD|_TIF_SPEC_INDIR_BRANCH)

 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 74bef48..48fcd46 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -406,6 +406,8 @@ static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
 	if (static_cpu_has(X86_FEATURE_SSBD))
 		msr |= ssbd_tif_to_spec_ctrl(tifn);

+	msr |= stibp_tif_to_spec_ctrl(tifn);
+
 	wrmsrl(MSR_IA32_SPEC_CTRL, msr);
 }

@@ -418,7 +420,17 @@ static __always_inline void spec_ctrl_update_msr(unsigned long tifn)
 static __always_inline void __speculation_ctrl_update(unsigned long tifp,
 						      unsigned long tifn)
 {
-	bool updmsr = false;
+	bool updmsr;
+
+	/*
+	 * Need STIBP defense against Spectre v2 attack
+	 * if SMT is in use and enhanced IBRS is unsupported.
+	 */
+	if (static_cpu_has(X86_FEATURE_STIBP) && cpu_use_smt_and_hotplug &&
+	    !static_cpu_has(X86_FEATURE_USE_IBRS_ENHANCED))
+		updmsr = !!((tifp ^ tifn) & _TIF_SPEC_INDIR_BRANCH);
+	else
+		updmsr = false;

 	/* If TIF_SSBD is different, select the proper mitigation method */
 	if ((tifp ^ tifn) & _TIF_SSBD) {
Add new protection modes to mitigate Spectre v2 attacks on user processes. There are three modes:
none mode: In this mode, no mitigations are deployed.
strict mode: In this mode, IBPB and STIBP are deployed on all tasks.
prctl mode: In this mode, IBPB and STIBP are only deployed on tasks that choose to restrict indirect branch speculation via prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_ENABLE, 0, 0);
The protection mode can be specified by the spectre_v2_app2app boot parameter with the following semantics:
spectre_v2_app2app=
	on     - Unconditionally enable mitigations for all tasks.
	off    - Unconditionally disable mitigations for all tasks.
	auto   - Kernel detects whether the CPU model has IBPB and
	         STIBP mitigations against Spectre V2 attacks. If the
	         CPU is not vulnerable, "off" is selected. If the CPU
	         is vulnerable, the default mitigation is "prctl".
	         See below.
	prctl  - Enable mitigations per thread by restricting indirect
	         branch speculation via prctl. Mitigation for a thread
	         is not enabled by default to avoid mitigation
	         overhead. The state of the control is inherited on
	         fork.
Not specifying this option is equivalent to spectre_v2_app2app=auto.
Setting spectre_v2=off implies spectre_v2_app2app=off and spectre_v2_app2app boot parameter is ignored.
Setting spectre_v2=on implies spectre_v2_app2app=on and spectre_v2_app2app boot parameter is ignored.
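For example, to force the per-task policy regardless of auto selection, one would boot with a command line addition like this (illustrative):

	spectre_v2_app2app=prctl

and can verify the resulting mode afterwards through the existing Spectre v2 status interface:

	cat /sys/devices/system/cpu/vulnerabilities/spectre_v2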
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  28 +++++
 arch/x86/include/asm/nospec-branch.h            |   9 ++
 arch/x86/kernel/cpu/bugs.c                      | 135 ++++++++++++++++++++++--
 3 files changed, 166 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 81d1d5a..d2255f7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4215,6 +4215,34 @@
 			Not specifying this option is equivalent to
 			spectre_v2=auto.

+	spectre_v2_app2app=
+			[X86] Control mitigation of Spectre variant 2
+			userspace vulnerability due to indirect branch
+			speculation.
+
+			The options are:
+
+		on    - Unconditionally enable mitigations for all tasks.
+		off   - Unconditionally disable mitigations for all tasks.
+		auto  - Kernel detects whether the CPU model has IBPB
+			and STIBP mitigations against Spectre V2 attacks.
+			If the CPU is not vulnerable, "off" is selected.
+			If the CPU is vulnerable, the default mitigation
+			is "prctl".
+		prctl - Enable mitigations per thread by restricting
+			indirect branch speculation via prctl.
+			Mitigation for a thread is not enabled by default to
+			avoid mitigation overhead. The state of
+			of the control is inherited on fork.
+
+			Not specifying this option is equivalent to
+			spectre_v2_app2app=auto.
+
+			Setting spectre_v2=off implies spectre_v2_app2app=off
+			and spectre_v2_app2app boot parameter is ignored.
+
+			Setting spectre_v2=on implies spectre_v2_app2app=on
+			and spectre_v2_app2app boot parameter is ignored.
+
 	spec_store_bypass_disable=
 			[HW] Control Speculative Store Bypass (SSB) Disable mitigation
 			(Speculative Store Bypass vulnerability)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 80dc144..69d2657 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -3,6 +3,7 @@
 #ifndef _ASM_X86_NOSPEC_BRANCH_H_
 #define _ASM_X86_NOSPEC_BRANCH_H_

+#include <linux/static_key.h>
 #include <asm/alternative.h>
 #include <asm/alternative-asm.h>
 #include <asm/cpufeatures.h>
@@ -226,6 +227,12 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_IBRS_ENHANCED,
 };

+enum spectre_v2_app2app_mitigation {
+	SPECTRE_V2_APP2APP_NONE,
+	SPECTRE_V2_APP2APP_STRICT,
+	SPECTRE_V2_APP2APP_PRCTL,
+};
+
 /* The Speculative Store Bypass disable variants */
 enum ssb_mitigation {
 	SPEC_STORE_BYPASS_NONE,
@@ -237,6 +244,8 @@ enum ssb_mitigation {
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];

+DECLARE_STATIC_KEY_FALSE(spectre_v2_app_lite);
+
 /*
  * On VMEXIT we must ensure that no RSB predictions learned in the guest
  * can be followed in the host, by overwriting the RSB completely. Both
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 3e5ae2c..387de54 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -133,6 +133,13 @@ enum spectre_v2_mitigation_cmd {
 	SPECTRE_V2_CMD_RETPOLINE_AMD,
 };

+enum spectre_v2_app2app_mitigation_cmd {
+	SPECTRE_V2_APP2APP_CMD_NONE,
+	SPECTRE_V2_APP2APP_CMD_FORCE,
+	SPECTRE_V2_APP2APP_CMD_AUTO,
+	SPECTRE_V2_APP2APP_CMD_PRCTL,
+};
+
 static const char *spectre_v2_strings[] = {
 	[SPECTRE_V2_NONE]			= "Vulnerable",
 	[SPECTRE_V2_RETPOLINE_MINIMAL]		= "Vulnerable: Minimal generic ASM retpoline",
@@ -142,12 +149,24 @@ static const char *spectre_v2_strings[] = {
 	[SPECTRE_V2_IBRS_ENHANCED]		= "Mitigation: Enhanced IBRS",
 };

+static const char *spectre_v2_app2app_strings[] = {
+	[SPECTRE_V2_APP2APP_NONE]	= "App-App Vulnerable",
+	[SPECTRE_V2_APP2APP_STRICT]	= "App-App Mitigation: Full app to app attack protection",
+	[SPECTRE_V2_APP2APP_PRCTL]	= "App-App Mitigation: Protect branch speculation restricted tasks",
+};
+
+/* Lightweight mitigation: mitigate only tasks with TIF_SPEC_INDIR_BRANCH */
+DEFINE_STATIC_KEY_FALSE(spectre_v2_app_lite);
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"Spectre V2 : " fmt

 static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
 	SPECTRE_V2_NONE;

+static enum spectre_v2_app2app_mitigation
+	spectre_v2_app2app_enabled __ro_after_init = SPECTRE_V2_APP2APP_NONE;
+
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
@@ -169,6 +188,9 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 		    static_cpu_has(X86_FEATURE_AMD_SSBD))
 			hostval |= ssbd_tif_to_spec_ctrl(ti->flags);

+		if (static_branch_unlikely(&spectre_v2_app_lite))
+			hostval |= stibp_tif_to_spec_ctrl(ti->flags);
+
 		if (hostval != guestval) {
 			msrval = setguest ? guestval : hostval;
 			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
@@ -275,6 +297,60 @@ static const struct {
 	{ "auto", SPECTRE_V2_CMD_AUTO, false },
 };

+static const struct {
+	const char *option;
+	enum spectre_v2_app2app_mitigation_cmd cmd;
+	bool secure;
+} app2app_options[] = {
+	{ "off",   SPECTRE_V2_APP2APP_CMD_NONE,  false },
+	{ "on",    SPECTRE_V2_APP2APP_CMD_FORCE, true },
+	{ "auto",  SPECTRE_V2_APP2APP_CMD_AUTO,  false },
+	{ "prctl", SPECTRE_V2_APP2APP_CMD_PRCTL, false },
+};
+
+static enum spectre_v2_app2app_mitigation_cmd __init
+	spectre_v2_parse_app2app_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
+{
+	enum spectre_v2_app2app_mitigation_cmd cmd = SPECTRE_V2_APP2APP_CMD_AUTO;
+	char arg[20];
+	int ret, i;
+
+	if (v2_cmd == SPECTRE_V2_CMD_FORCE) {
+		cmd = SPECTRE_V2_APP2APP_CMD_FORCE;
+		goto show_cmd;
+	}
+
+	if (v2_cmd == SPECTRE_V2_CMD_NONE) {
+		cmd = SPECTRE_V2_APP2APP_CMD_NONE;
+		goto show_cmd;
+	}
+
+	ret = cmdline_find_option(boot_command_line, "spectre_v2_app2app",
+				  arg, sizeof(arg));
+	if (ret < 0)
+		return SPECTRE_V2_APP2APP_CMD_AUTO;
+
+	for (i = 0; i < ARRAY_SIZE(app2app_options); i++) {
+		if (!match_option(arg, ret, app2app_options[i].option))
+			continue;
+		cmd = app2app_options[i].cmd;
+		break;
+	}
+
+	if (i >= ARRAY_SIZE(app2app_options)) {
+		pr_err("unknown app to app protection option (%s). Switching to AUTO select\n", arg);
+		return SPECTRE_V2_APP2APP_CMD_AUTO;
+	}
+
+show_cmd:
+	if (app2app_options[cmd].secure)
+		spec2_print_if_secure(app2app_options[cmd].option);
+	else
+		spec2_print_if_insecure(app2app_options[cmd].option);
+
+	return cmd;
+}
+
 static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 {
 	char arg[20];
@@ -326,13 +402,20 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 /* Determine if STIBP should be always on. */
 static bool stibp_needed(void)
 {
-	if (spectre_v2_enabled == SPECTRE_V2_NONE)
+	if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE)
 		return false;

 	/* Using enhanced IBRS makes using STIBP unnecessary. */
 	if (static_cpu_has(X86_FEATURE_USE_IBRS_ENHANCED))
 		return false;

+	/*
+	 * For lite option, STIBP is used only for task with
+	 * TIF_SPEC_INDIR_BRANCH flag. STIBP is not always on for that case.
+	 */
+	if (static_branch_unlikely(&spectre_v2_app_lite))
+		return false;
+
 	if (!boot_cpu_has(X86_FEATURE_STIBP))
 		return false;

@@ -373,6 +456,8 @@ static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
 	enum spectre_v2_mitigation mode = SPECTRE_V2_NONE;
+	enum spectre_v2_app2app_mitigation_cmd app2app_cmd;
+	enum spectre_v2_app2app_mitigation app2app_mode;

 	/*
 	 * If the CPU is not affected and the command line mode is NONE or AUTO
@@ -471,6 +556,43 @@ static void __init spectre_v2_select_mitigation(void)
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}

+	app2app_mode = SPECTRE_V2_APP2APP_NONE;
+
+	/* No mitigation if mitigation feature is unavailable */
+	if (!boot_cpu_has(X86_FEATURE_STIBP))
+		goto set_app2app_mode;
+
+	app2app_cmd = spectre_v2_parse_app2app_cmdline(cmd);
+
+	/*
+	 * If the CPU is not affected and the command line mode is NONE or AUTO
+	 * then no mitigation used.
+	 */
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2) &&
+	    (app2app_cmd == SPECTRE_V2_APP2APP_CMD_NONE ||
+	     app2app_cmd == SPECTRE_V2_APP2APP_CMD_AUTO))
+		goto set_app2app_mode;
+
+	switch (app2app_cmd) {
+	case SPECTRE_V2_APP2APP_CMD_NONE:
+		break;
+
+	case SPECTRE_V2_APP2APP_CMD_PRCTL:
+	case SPECTRE_V2_APP2APP_CMD_AUTO:
+		app2app_mode = SPECTRE_V2_APP2APP_PRCTL;
+		break;
+
+	case SPECTRE_V2_APP2APP_CMD_FORCE:
+		app2app_mode = SPECTRE_V2_APP2APP_STRICT;
+		break;
+	}
+
+set_app2app_mode:
+	spectre_v2_app2app_enabled = app2app_mode;
+	pr_info("%s\n", spectre_v2_app2app_strings[app2app_mode]);
+	if (app2app_mode == SPECTRE_V2_APP2APP_PRCTL)
+		static_branch_enable(&spectre_v2_app_lite);
+
 	/* Enable STIBP if appropriate */
 	arch_smt_update();
 }
@@ -862,13 +984,14 @@ static ssize_t l1tf_show_state(char *buf)

 static char *stibp_state(void)
 {
-	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
+	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED ||
+	    spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE ||
+	    !cpu_use_smt_and_hotplug)
 		return "";
-
-	if (x86_spec_ctrl_base & SPEC_CTRL_STIBP)
-		return ", STIBP";
+	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_PRCTL)
+		return ", STIBP-prctl";
 	else
-		return "";
+		return ", STIBP-all";
 }

 static char *ibpb_state(void)
Create a PRCTL interface to restrict an application's indirect branch speculation. This will protect the application against Spectre v2 attacks from another application.
Invocations:
 Check indirect branch speculation status with
 - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, 0, 0, 0);

 Enable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_ENABLE, 0, 0);

 Disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_DISABLE, 0, 0);

 Force disable indirect branch speculation with
 - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
See Documentation/userspace-api/spec_ctrl.rst.
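As a usage sketch (not part of the patch), a task would opt in to the mitigation roughly like this; PR_SPEC_INDIR_BRANCH is the constant added by this series and may not be in installed headers yet, hence the fallback defines:

	#include <stdio.h>
	#include <sys/prctl.h>

	#ifndef PR_SET_SPECULATION_CTRL
	#define PR_GET_SPECULATION_CTRL	52
	#define PR_SET_SPECULATION_CTRL	53
	#endif
	#ifndef PR_SPEC_INDIR_BRANCH
	#define PR_SPEC_INDIR_BRANCH	1	/* added by this series */
	#endif
	#ifndef PR_SPEC_DISABLE
	#define PR_SPEC_DISABLE		(1UL << 2)
	#endif

	int main(void)
	{
		/* Restrict (disable) indirect branch speculation for this
		 * task, which sets TIF_SPEC_INDIR_BRANCH on it. */
		if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH,
			  PR_SPEC_DISABLE, 0, 0))
			perror("PR_SET_SPECULATION_CTRL");

		/* The PR_SPEC_* state bits come back in the return value. */
		printf("indirect branch speculation state: 0x%x\n",
		       prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH,
			     0, 0, 0));
		return 0;
	}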
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 Documentation/userspace-api/spec_ctrl.rst |  9 ++++
 arch/x86/kernel/cpu/bugs.c                | 80 +++++++++++++++++++++++++++++++
 include/linux/sched.h                     |  9 ++++
 include/uapi/linux/prctl.h                |  1 +
 tools/include/uapi/linux/prctl.h          |  1 +
 5 files changed, 100 insertions(+)

diff --git a/Documentation/userspace-api/spec_ctrl.rst b/Documentation/userspace-api/spec_ctrl.rst
index 32f3d55..8a4e268 100644
--- a/Documentation/userspace-api/spec_ctrl.rst
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -92,3 +92,12 @@ Speculation misfeature controls
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0);
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0);
    * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_INDIR_BRANCH: Indirect Branch Speculation in User Processes
+  (Mitigate Spectre V2 style attacks against user processes)
+
+  Invocations:
+   * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, 0, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_ENABLE, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_DISABLE, 0, 0);
+   * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 387de54..26e1a87 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -771,12 +771,69 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
 	return 0;
 }

+static void set_task_restrict_indir_branch(struct task_struct *tsk, bool restrict_on)
+{
+	bool update = false;
+
+	if (restrict_on)
+		update = !test_and_set_tsk_thread_flag(tsk, TIF_SPEC_INDIR_BRANCH);
+	else
+		update = test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_INDIR_BRANCH);
+
+	if (tsk == current && update)
+		speculation_ctrl_update_current();
+}
+
+static int indir_branch_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
+	switch (ctrl) {
+	case PR_SPEC_ENABLE:
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE)
+			return 0;
+		/*
+		 * Indirect branch speculation is always disabled in
+		 * strict mode.
+		 */
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_STRICT)
+			return -EPERM;
+		task_clear_spec_indir_branch_disable(task);
+		set_task_restrict_indir_branch(task, false);
+		break;
+	case PR_SPEC_DISABLE:
+		/*
+		 * Indirect branch speculation is always allowed when
+		 * mitigation is force disabled.
+		 */
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE)
+			return -EPERM;
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_STRICT)
+			return 0;
+		task_set_spec_indir_branch_disable(task);
+		set_task_restrict_indir_branch(task, true);
+		break;
+	case PR_SPEC_FORCE_DISABLE:
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE)
+			return -EPERM;
+		if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_STRICT)
+			return 0;
+		task_set_spec_indir_branch_disable(task);
+		task_set_spec_indir_branch_force_disable(task);
+		set_task_restrict_indir_branch(task, true);
+		break;
+	default:
+		return -ERANGE;
+	}
+	return 0;
+}
+
 int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,
			     unsigned long ctrl)
 {
 	switch (which) {
 	case PR_SPEC_STORE_BYPASS:
 		return ssb_prctl_set(task, ctrl);
+	case PR_SPEC_INDIR_BRANCH:
+		return indir_branch_prctl_set(task, ctrl);
 	default:
 		return -ENODEV;
 	}
@@ -809,11 +866,34 @@ static int ssb_prctl_get(struct task_struct *task)
 	}
 }

+static int indir_branch_prctl_get(struct task_struct *task)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		return PR_SPEC_NOT_AFFECTED;
+
+	switch (spectre_v2_app2app_enabled) {
+	case SPECTRE_V2_APP2APP_NONE:
+		return PR_SPEC_ENABLE;
+	case SPECTRE_V2_APP2APP_PRCTL:
+		if (task_spec_indir_branch_force_disable(task))
+			return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
+		if (test_tsk_thread_flag(task, TIF_SPEC_INDIR_BRANCH))
+			return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+		return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+	case SPECTRE_V2_APP2APP_STRICT:
+		return PR_SPEC_DISABLE;
+	default:
+		return PR_SPEC_NOT_AFFECTED;
+	}
+}
+
 int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
 {
 	switch (which) {
 	case PR_SPEC_STORE_BYPASS:
 		return ssb_prctl_get(task);
+	case PR_SPEC_INDIR_BRANCH:
+		return indir_branch_prctl_get(task);
 	default:
 		return -ENODEV;
 	}
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a51c13c..e92e4bf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1453,6 +1453,8 @@ static inline bool is_percpu_thread(void)
 #define PFA_SPREAD_SLAB			2	/* Spread some slab caches over cpuset */
 #define PFA_SPEC_SSB_DISABLE		3	/* Speculative Store Bypass disabled */
 #define PFA_SPEC_SSB_FORCE_DISABLE	4	/* Speculative Store Bypass force disabled*/
+#define PFA_SPEC_INDIR_BRANCH_DISABLE	5	/* Indirect branch speculation restricted in apps */
+#define PFA_SPEC_INDIR_BRANCH_FORCE_DISABLE 6	/* Indirect branch speculation restricted in apps forced */

 #define TASK_PFA_TEST(name, func)					\
 	static inline bool task_##func(struct task_struct *p)		\
@@ -1484,6 +1486,13 @@ TASK_PFA_CLEAR(SPEC_SSB_DISABLE, spec_ssb_disable)
 TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)
 TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable)

+TASK_PFA_TEST(SPEC_INDIR_BRANCH_DISABLE, spec_indir_branch_disable)
+TASK_PFA_SET(SPEC_INDIR_BRANCH_DISABLE, spec_indir_branch_disable)
+TASK_PFA_CLEAR(SPEC_INDIR_BRANCH_DISABLE, spec_indir_branch_disable)
+
+TASK_PFA_TEST(SPEC_INDIR_BRANCH_FORCE_DISABLE, spec_indir_branch_force_disable)
+TASK_PFA_SET(SPEC_INDIR_BRANCH_FORCE_DISABLE, spec_indir_branch_force_disable)
+
 static inline void
 current_restore_flags(unsigned long orig_flags, unsigned long flags)
 {
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index c0d7ea0..577f2ca 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -212,6 +212,7 @@ struct prctl_mm_map {
 #define PR_SET_SPECULATION_CTRL		53
 /* Speculation control variants */
 # define PR_SPEC_STORE_BYPASS		0
+# define PR_SPEC_INDIR_BRANCH		1
 /* Return and control values for PR_SET/GET_SPECULATION_CTRL */
 # define PR_SPEC_NOT_AFFECTED		0
 # define PR_SPEC_PRCTL			(1UL << 0)
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index c0d7ea0..577f2ca 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -212,6 +212,7 @@ struct prctl_mm_map {
 #define PR_SET_SPECULATION_CTRL		53
 /* Speculation control variants */
 # define PR_SPEC_STORE_BYPASS		0
+# define PR_SPEC_INDIR_BRANCH		1
 /* Return and control values for PR_SET/GET_SPECULATION_CTRL */
 # define PR_SPEC_NOT_AFFECTED		0
 # define PR_SPEC_PRCTL			(1UL << 0)
IBPB is currently applied to all tasks. However, when spectre_v2_app2app_enabled is set to the default value SPECTRE_V2_APP2APP_PRCTL, only tasks marked with TIF_SPEC_INDIR_BRANCH via prctl are protected against Spectre V2 sibling thread attacks, to minimize performance impact.

Extend this policy to IBPB, so that only tasks marked with TIF_SPEC_INDIR_BRANCH and needing mitigation are protected, minimizing performance impact.

Make IBPB usage follow the spectre_v2_app2app_enabled option:
  SPECTRE_V2_APP2APP_PRCTL  : Use IBPB only on tasks with TIF_SPEC_INDIR_BRANCH
  SPECTRE_V2_APP2APP_STRICT : Use IBPB on all tasks
  SPECTRE_V2_APP2APP_NONE   : Don't use IBPB
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 29 ++++++++++++++++++-----------
 arch/x86/mm/tlb.c          | 23 ++++++++++++++++++-----
 2 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 26e1a87..44f7127 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -534,12 +534,6 @@ static void __init spectre_v2_select_mitigation(void)
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");

-	/* Initialize Indirect Branch Prediction Barrier if supported */
-	if (boot_cpu_has(X86_FEATURE_IBPB)) {
-		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
-		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
-	}
-
 	/*
 	 * Retpoline means the kernel is safe because it has no indirect
 	 * branches. Enhanced IBRS protects firmware too, so, enable restricted
@@ -558,8 +552,9 @@ static void __init spectre_v2_select_mitigation(void)
 	app2app_mode = SPECTRE_V2_APP2APP_NONE;

-	/* No mitigation if mitigation feature is unavailable */
-	if (!boot_cpu_has(X86_FEATURE_STIBP))
+	/* No mitigation if all mitigation features are unavailable */
+	if (!boot_cpu_has(X86_FEATURE_IBPB) &&
+	    !boot_cpu_has(X86_FEATURE_STIBP))
 		goto set_app2app_mode;

 	app2app_cmd = spectre_v2_parse_app2app_cmdline(cmd);
@@ -587,6 +582,16 @@ static void __init spectre_v2_select_mitigation(void)
 		break;
 	}

+	/*
+	 * Initialize Indirect Branch Prediction Barrier if supported
+	 * and not disabled explicitly
+	 */
+	if (boot_cpu_has(X86_FEATURE_IBPB) &&
+	    app2app_mode != SPECTRE_V2_APP2APP_NONE) {
+		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
+		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
+	}
+
 set_app2app_mode:
 	spectre_v2_app2app_enabled = app2app_mode;
 	pr_info("%s\n", spectre_v2_app2app_strings[app2app_mode]);
@@ -1076,10 +1081,12 @@ static char *stibp_state(void)

 static char *ibpb_state(void)
 {
-	if (boot_cpu_has(X86_FEATURE_USE_IBPB))
-		return ", IBPB";
-	else
+	if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_NONE)
 		return "";
+	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_PRCTL)
+		return ", IBPB-prctl";
+	else
+		return ", IBPB-all";
 }

 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index bddd6b3..616694c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -184,14 +184,27 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
 static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
 {
 	/*
-	 * Check if the current (previous) task has access to the memory
-	 * of the @tsk (next) task. If access is denied, make sure to
-	 * issue a IBPB to stop user->user Spectre-v2 attacks.
+	 * Don't issue IBPB when switching to kernel threads or staying in the
+	 * same mm context.
+	 */
+	if (!tsk || !tsk->mm || tsk->mm->context.ctx_id == last_ctx_id)
+		return false;
+
+	/*
+	 * If lite protection mode is enabled, check the STIBP thread flag.
+	 *
+	 * Otherwise check if the current (previous) task has access to the
+	 * the memory of the @tsk (next) task for strict app to app protection.
+	 * If access is denied, make sure to issue a IBPB to stop user->user
+	 * Spectre-v2 attacks.
 	 *
 	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
 	 */
-	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
-		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+
+	if (static_branch_unlikely(&spectre_v2_app_lite))
+		return test_tsk_thread_flag(tsk, TIF_SPEC_INDIR_BRANCH);
+	else
+		return ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB);
 }

 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
From: Jiri Kosina <jikos@kernel.org>

From: Jiri Kosina <jkosina@suse.cz>
If the 'prctl' mode of app2app protection from Spectre v2 is selected on the kernel command line, we currently apply STIBP protection to tasks that restrict their indirect branch speculation via
prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIR_BRANCH, PR_SPEC_ENABLE, 0, 0);
Let's extend this to also cover SECCOMP tasks (analogously to how we apply SSBD protection).
According to software guidance:
"Setting ... STIBP ... on a logical processor prevents the predicted targets of indirect branches on any logical processor of that core from being controlled by software that executes (or executed previously) on another logical processor of the same core."
https://software.intel.com/security-software-guidance/insights/deep-dive-sin...
Hence setting STIBP on a sandboxed task will prevent the task from attacking sibling threads or being attacked by them.
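For reference, a minimal sketch of a task entering seccomp filter mode (standard seccomp API, not part of this patch); with this change applied, installing the filter would also set TIF_SPEC_INDIR_BRANCH on the task in 'seccomp' mode:

	#include <stdio.h>
	#include <sys/prctl.h>
	#include <linux/seccomp.h>
	#include <linux/filter.h>

	int main(void)
	{
		/* Allow-everything filter; merely entering seccomp mode is
		 * what triggers arch_seccomp_spec_mitigate(), and hence
		 * STIBP, here. */
		struct sock_filter filter[] = {
			BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
		};
		struct sock_fprog prog = {
			.len = sizeof(filter) / sizeof(filter[0]),
			.filter = filter,
		};

		if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
			perror("PR_SET_NO_NEW_PRIVS");
		if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog))
			perror("PR_SET_SECCOMP");

		puts("now a seccomp task");
		return 0;
	}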
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ++++++-
 arch/x86/include/asm/nospec-branch.h            |  1 +
 arch/x86/kernel/cpu/bugs.c                      | 21 +++++++++++++++++++--
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d2255f7..89b193c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4227,12 +4227,17 @@
 			and STIBP mitigations against Spectre V2 attacks.
 			If the CPU is not vulnerable, "off" is selected.
 			If the CPU is vulnerable, the default mitigation
-			is "prctl".
+			is architecture and Kconfig dependent. See below.
 		prctl - Enable mitigations per thread by restricting
 			indirect branch speculation via prctl.
 			Mitigation for a thread is not enabled by default to
 			avoid mitigation overhead. The state of
 			of the control is inherited on fork.
+		seccomp - Same as "prctl" above, but all seccomp threads
+			will disable SSB unless they explicitly opt out.
+
+		Default mitigations:
+		If CONFIG_SECCOMP=y "seccomp", otherwise "prctl"

 			Not specifying this option is equivalent to
 			spectre_v2_app2app=auto.
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 69d2657..077ec54 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -231,6 +231,7 @@ enum spectre_v2_app2app_mitigation {
 	SPECTRE_V2_APP2APP_NONE,
 	SPECTRE_V2_APP2APP_STRICT,
 	SPECTRE_V2_APP2APP_PRCTL,
+	SPECTRE_V2_APP2APP_SECCOMP,
 };

 /* The Speculative Store Bypass disable variants */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 44f7127..f349b3f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -138,6 +138,7 @@ enum spectre_v2_app2app_mitigation_cmd {
 	SPECTRE_V2_APP2APP_CMD_FORCE,
 	SPECTRE_V2_APP2APP_CMD_AUTO,
 	SPECTRE_V2_APP2APP_CMD_PRCTL,
+	SPECTRE_V2_APP2APP_CMD_SECCOMP,
 };

 static const char *spectre_v2_strings[] = {
@@ -153,6 +154,7 @@ static const char *spectre_v2_app2app_strings[] = {
 	[SPECTRE_V2_APP2APP_NONE]	= "App-App Vulnerable",
 	[SPECTRE_V2_APP2APP_STRICT]	= "App-App Mitigation: Full app to app attack protection",
 	[SPECTRE_V2_APP2APP_PRCTL]	= "App-App Mitigation: Protect branch speculation restricted tasks",
+	[SPECTRE_V2_APP2APP_SECCOMP]	= "App-App Mitigation: Protect branch speculation restricted and seccomp tasks",
 };

 /* Lightweight mitigation: mitigate only tasks with TIF_SPEC_INDIR_BRANCH */
@@ -573,10 +575,17 @@ static void __init spectre_v2_select_mitigation(void)
 		break;

 	case SPECTRE_V2_APP2APP_CMD_PRCTL:
-	case SPECTRE_V2_APP2APP_CMD_AUTO:
 		app2app_mode = SPECTRE_V2_APP2APP_PRCTL;
 		break;

+	case SPECTRE_V2_APP2APP_CMD_AUTO:
+	case SPECTRE_V2_APP2APP_CMD_SECCOMP:
+		if (IS_ENABLED(CONFIG_SECCOMP))
+			app2app_mode = SPECTRE_V2_APP2APP_SECCOMP;
+		else
+			app2app_mode = SPECTRE_V2_APP2APP_PRCTL;
+		break;
+
 	case SPECTRE_V2_APP2APP_CMD_FORCE:
 		app2app_mode = SPECTRE_V2_APP2APP_STRICT;
 		break;
@@ -595,7 +604,8 @@ static void __init spectre_v2_select_mitigation(void)
 set_app2app_mode:
 	spectre_v2_app2app_enabled = app2app_mode;
 	pr_info("%s\n", spectre_v2_app2app_strings[app2app_mode]);
-	if (app2app_mode == SPECTRE_V2_APP2APP_PRCTL)
+	if (app2app_mode == SPECTRE_V2_APP2APP_PRCTL ||
+	    app2app_mode == SPECTRE_V2_APP2APP_SECCOMP)
 		static_branch_enable(&spectre_v2_app_lite);

 	/* Enable STIBP if appropriate */
@@ -849,6 +859,8 @@ void arch_seccomp_spec_mitigate(struct task_struct *task)
 {
 	if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP)
 		ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE);
+	if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_SECCOMP)
+		set_task_restrict_indir_branch(task, true);
 }
 #endif

@@ -879,6 +891,7 @@ static int indir_branch_prctl_get(struct task_struct *task)
 	switch (spectre_v2_app2app_enabled) {
 	case SPECTRE_V2_APP2APP_NONE:
 		return PR_SPEC_ENABLE;
+	case SPECTRE_V2_APP2APP_SECCOMP:
 	case SPECTRE_V2_APP2APP_PRCTL:
 		if (task_spec_indir_branch_force_disable(task))
 			return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
@@ -1075,6 +1088,8 @@ static char *stibp_state(void)
 		return "";
 	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_PRCTL)
 		return ", STIBP-prctl";
+	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_SECCOMP)
+		return ", STIBP-seccomp";
 	else
 		return ", STIBP-all";
 }
@@ -1085,6 +1100,8 @@ static char *ibpb_state(void)
 		return "";
 	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_PRCTL)
 		return ", IBPB-prctl";
+	else if (spectre_v2_app2app_enabled == SPECTRE_V2_APP2APP_SECCOMP)
+		return ", IBPB-seccomp";
 	else
 		return ", IBPB-all";
 }
On Tue, 20 Nov 2018, Tim Chen wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index d2255f7..89b193c 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4227,12 +4227,17 @@
>  			and STIBP mitigations against Spectre V2 attacks.
>  			If the CPU is not vulnerable, "off" is selected.
>  			If the CPU is vulnerable, the default mitigation
> -			is "prctl".
> +			is architecture and Kconfig dependent. See below.
>  		prctl - Enable mitigations per thread by restricting
>  			indirect branch speculation via prctl.
>  			Mitigation for a thread is not enabled by default to
>  			avoid mitigation overhead. The state of
>  			of the control is inherited on fork.
> +		seccomp - Same as "prctl" above, but all seccomp threads
> +			will disable SSB unless they explicitly opt out.
As Dave already pointed out elsewhere -- the "SSB" here is probably a copy/paste error. It should read something along the lines of "... will restrict indirect branch speculation ..."
Thanks,
On 11/20/2018 04:44 PM, Jiri Kosina wrote:
On Tue, 20 Nov 2018, Tim Chen wrote:
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d2255f7..89b193c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4227,12 +4227,17 @@
 			and STIBP mitigations against Spectre V2 attacks.
 			If the CPU is not vulnerable, "off" is selected.
 			If the CPU is vulnerable, the default mitigation
-			is "prctl".
+			is architecture and Kconfig dependent. See below.
 			prctl   - Enable mitigations per thread by restricting
 				  indirect branch speculation via prctl.
 				  Mitigation for a thread is not enabled by default
 				  to avoid mitigation overhead. The state of
 				  the control is inherited on fork.
+			seccomp - Same as "prctl" above, but all seccomp threads
+				  will disable SSB unless they explicitly opt out.
As Dave already pointed out elsewhere -- the "SSB" here is probably a copy/paste error. It should read something along the lines of "... will restrict indirect branch speculation ..."
Thanks. Should have caught it.
Tim
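For context on the "seccomp threads" wording being discussed: a task becomes one by entering seccomp filter mode, which invokes arch_seccomp_spec_mitigate() from the patch above. A minimal, hypothetical sketch of the trigger point (an allow-everything filter, just to show where the mitigation kicks in; real filters restrict syscalls):

#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <stdio.h>

int main(void)
{
	/* Trivial filter that allows every syscall. */
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = { .len = 1, .filter = insns };

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		perror("PR_SET_NO_NEW_PRIVS");
	/*
	 * Entering filter mode calls arch_seccomp_spec_mitigate(); with
	 * spectre_v2_app2app=seccomp this sets TIF_SPEC_INDIR_BRANCH,
	 * i.e. restricts indirect branch speculation for this task.
	 */
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog))
		perror("PR_SET_SECCOMP");
	return 0;
}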
When a task is made non-dumpable, a higher level of security is implied, since access to its memory is restricted. Many daemons handling sensitive data (e.g. sshd) make themselves non-dumpable. Such tasks should have speculative execution restricted to protect them from attacks that take advantage of CPU speculation side channels.
Add calls to arch_update_spec_restriction() to update the speculative restrictions on a task when its dumpability changes: restrict speculative execution when the task becomes non-dumpable, and relax the restrictions when it becomes dumpable.
A change to dumpability occurs via the setgid, setuid, or prctl(PR_SET_DUMPABLE) syscalls. The user should expect that the associated change in speculative restriction applies only to the task issuing the syscall; it is not extended to other threads in the same process. This should not be a problem, as such changes should be made before spawning additional threads.
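For illustration (not part of the patch): a daemon typically drops dumpability with the prctl below, and with this series the same call would now also trigger arch_update_spec_restriction() for the calling task.

#include <sys/prctl.h>
#include <stdio.h>

int main(void)
{
	/*
	 * 0 == SUID_DUMP_DISABLE: the task cannot be core dumped or
	 * ptraced by an unprivileged user.  With this series, the kernel
	 * also restricts the task's indirect branch speculation from
	 * here on.
	 */
	if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0))
		perror("prctl(PR_SET_DUMPABLE)");

	/* ... handle sensitive data (keys, passwords) ... */
	return 0;
}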
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 fs/exec.c           | 3 +++
 include/linux/cpu.h | 3 +++
 kernel/cpu.c        | 5 +++++
 kernel/cred.c       | 5 ++++-
 kernel/sys.c        | 7 +++++++
 5 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/exec.c b/fs/exec.c
index fc281b7..d72e20d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -62,6 +62,7 @@
 #include <linux/oom.h>
 #include <linux/compat.h>
 #include <linux/vmalloc.h>
+#include <linux/cpu.h>

 #include <linux/uaccess.h>
 #include <asm/mmu_context.h>
@@ -1366,6 +1367,8 @@ void setup_new_exec(struct linux_binprm * bprm)
 	else
 		set_dumpable(current->mm, SUID_DUMP_USER);

+	arch_update_spec_restriction(current);
+
 	arch_setup_new_exec();
 	perf_event_exec();
 	__set_task_comm(current, kbasename(bprm->filename), true);
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 6f43024..4fef90a 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -187,4 +187,7 @@ static inline void cpu_smt_check_topology(void) { }
 DECLARE_STATIC_KEY_TRUE(cpu_smt_enabled);
 #endif

+/* Update CPU's speculation restrictions on a task based on task's properties */
+extern int arch_update_spec_restriction(struct task_struct *task);
+
 #endif /* _LINUX_CPU_H_ */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f846416..fe93a8a 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2291,6 +2291,11 @@ void init_cpu_online(const struct cpumask *src)
 	cpumask_copy(&__cpu_online_mask, src);
 }

+int __weak arch_update_spec_restriction(struct task_struct *task)
+{
+	return 0;
+}
+
 /*
  * Activate the first processor.
  */
diff --git a/kernel/cred.c b/kernel/cred.c
index ecf0365..bc47653 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -19,6 +19,7 @@
 #include <linux/security.h>
 #include <linux/binfmts.h>
 #include <linux/cn_proc.h>
+#include <linux/cpu.h>

 #if 0
 #define kdebug(FMT, ...) \
@@ -445,8 +446,10 @@ int commit_creds(struct cred *new)
 	    !uid_eq(old->fsuid, new->fsuid) ||
 	    !gid_eq(old->fsgid, new->fsgid) ||
 	    !cred_cap_issubset(old, new)) {
-		if (task->mm)
+		if (task->mm) {
 			set_dumpable(task->mm, suid_dumpable);
+			arch_update_spec_restriction(task);
+		}
 		task->pdeath_signal = 0;
 		smp_wmb();
 	}
diff --git a/kernel/sys.c b/kernel/sys.c
index 123bd73..621ea94 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2290,6 +2290,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			break;
 		}
 		set_dumpable(me->mm, arg2);
+		/*
+		 * Any speculative execution restriction updates
+		 * associated with change in dumpability
+		 * applies only to the current task that issues
+		 * the request.
+		 */
+		arch_update_spec_restriction(me);
 		break;

 	case PR_SET_UNALIGN:
When a task changes its dumpability, arch_update_spec_restriction() is called to adjust the restriction on the task's speculative execution according to the dumpability change.
Implements arch_update_spec_restriction() for x86. Use STIBP to restrict speculative execution when running a task set to non-dumpable, or clear the restriction if the task is set to dumpable.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 ++-
 arch/x86/kernel/cpu/bugs.c                      | 23 ++++++++++++++++++++---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 89b193c..3979b12 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4229,7 +4229,8 @@
 			If the CPU is vulnerable, the default mitigation
 			is architecture and Kconfig dependent. See below.
 			prctl   - Enable mitigations per thread by restricting
-				  indirect branch speculation via prctl.
+				  indirect branch speculation via prctl or setting
+				  the thread as non-dumpable.
 				  Mitigation for a thread is not enabled by default
 				  to avoid mitigation overhead. The state of
 				  the control is inherited on fork.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f349b3f..6cd64445 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/nospec.h>
 #include <linux/prctl.h>
+#include <linux/coredump.h>

 #include <asm/spec-ctrl.h>
 #include <asm/cmdline.h>
@@ -153,8 +154,8 @@ static const char *spectre_v2_strings[] = {
 static const char *spectre_v2_app2app_strings[] = {
 	[SPECTRE_V2_APP2APP_NONE]	= "App-App Vulnerable",
 	[SPECTRE_V2_APP2APP_STRICT]	= "App-App Mitigation: Full app to app attack protection",
-	[SPECTRE_V2_APP2APP_PRCTL]	= "App-App Mitigation: Protect branch speculation restricted tasks",
-	[SPECTRE_V2_APP2APP_SECCOMP]	= "App-App Mitigation: Protect branch speculation restricted and seccomp tasks",
+	[SPECTRE_V2_APP2APP_PRCTL]	= "App-App Mitigation: Protect non-dumpable and branch speculation restricted tasks",
+	[SPECTRE_V2_APP2APP_SECCOMP]	= "App-App Mitigation: Protect non-dumpable, branch speculation restricted and seccomp tasks",
 };

 /* Lightweight mitigation: mitigate only tasks with TIF_SPEC_INDIR_BRANCH */
@@ -792,13 +793,29 @@ static void set_task_restrict_indir_branch(struct task_struct *tsk, bool restrict_on)

 	if (restrict_on)
 		update = !test_and_set_tsk_thread_flag(tsk, TIF_SPEC_INDIR_BRANCH);
-	else
+	else if (!task_spec_indir_branch_disable(tsk))
 		update = test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_INDIR_BRANCH);

 	if (tsk == current && update)
 		speculation_ctrl_update_current();
 }

+int arch_update_spec_restriction(struct task_struct *task)
+{
+	if (!static_branch_unlikely(&spectre_v2_app_lite))
+		return 0;
+
+	if (!task->mm)
+		return -EINVAL;
+
+	if (get_dumpable(task->mm) != SUID_DUMP_USER)
+		set_task_restrict_indir_branch(task, true);
+	else
+		set_task_restrict_indir_branch(task, false);
+
+	return 0;
+}
+
 static int indir_branch_prctl_set(struct task_struct *task, unsigned long ctrl)
 {
 	switch (ctrl) {
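As a usage sketch of the companion prctl interface added earlier in the series (an editorial example, not from the patch; the PR_SPEC_INDIRECT_BRANCH selector name is assumed here, following the PR_SPEC_* convention of the existing SSB control):

#include <sys/prctl.h>
#include <stdio.h>

int main(void)
{
	/*
	 * Explicitly restrict this task's indirect branch speculation,
	 * i.e. set TIF_SPEC_INDIR_BRANCH so the task runs with STIBP
	 * in the prctl/seccomp modes.
	 */
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
		  PR_SPEC_DISABLE, 0, 0))
		perror("PR_SET_SPECULATION_CTRL");

	/* Read back the state; see indir_branch_prctl_get() above. */
	printf("state: 0x%x\n",
	       prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
		     0, 0, 0));
	return 0;
}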
On Tue, Nov 20, 2018 at 4:33 PM Tim Chen tim.c.chen@linux.intel.com wrote:
Implements arch_update_spec_restriction() for x86. Use STIBP to restrict speculative execution when running a task set to non-dumpable, or clear the restriction if the task is set to dumpable.
I don't think this necessarily makes sense.
The new "auto" behavior is that we aim to restrict untrusted code (and the loader of such code uses prctrl to set that flag), then this whole "set STIBP for non-dumpable" makes little sense.
A non-dumpable app by definition is *more* trusted, not less trusted.
So this model of "let's disable prediction for system processes" not only doesn't make sense, but it also unnecessarily penalizes those potentially very important system processes.
Also, "dumpable" in general is pretty oddly defined to be used for this.
The same (privileged) process can be dumpable or not depending on how it was started (i.e. if it was started by a regular user and became trusted through suid, it's not dumpable, but if it was started from a root process it remains dumpable).
So I'm just not convinced "dumpability" is meaningful for STIBP.
Linus
On Tue, 20 Nov 2018, Linus Torvalds wrote:
Implements arch_update_spec_restriction() for x86. Use STIBP to restrict speculative execution when running a task set to non-dumpable, or clear the restriction if the task is set to dumpable.
I don't think this necessarily makes sense.
The new "auto" behavior is that we aim to restrict untrusted code (and the loader of such code uses prctrl to set that flag), then this whole "set STIBP for non-dumpable" makes little sense.
A non-dumpable app by definition is *more* trusted, not less trusted.
I understand your argument. I believe actually both ways of protection do make sense in some way (but it doesn't mean we should do it by default). Basically:
- process marks itself "I am loading untrusted code" via that prctl() in order to avoid its untrusted code being used as spectrev2 gadgets
- process marks itself "I am loading untrusted code" via that prctl() in order to have all its threads/subprocesses marked the same way, so that one thread can't influence speculative code flow of the other in order to read its memory (the "javascript in one browser tab reads secrets from another tab")
- non-dumpable tasks have the branch predictor flushed when context switching to them (IBPB), or when a sibling is running untrusted code (STIBP), in order to guarantee that their speculative code flow can never be influenced by a previous / sibling process mistraining the branch predictor for them, and therefore to not allow reading their secrets from memory through gadgets that would have to be in the process code itself
But I agree there are many reasons why this shouldn't be done by default if we accept 'prctl' as the default mode. Namely:
- the whole "proper gadgets need to be present in the process' .text" is dubious by itself
- the unavoidable overhead it'd impose on network daemons that you can't really get rid of
The distilled patchset that Thomas will be sending around today does not have the dumpability restriction in it.
Thanks,
On 11/20/2018 05:27 PM, Linus Torvalds wrote:
On Tue, Nov 20, 2018 at 4:33 PM Tim Chen tim.c.chen@linux.intel.com wrote:
Implements arch_update_spec_restriction() for x86. Use STIBP to restrict speculative execution when running a task set to non-dumpable, or clear the restriction if the task is set to dumpable.
I don't think this necessarily makes sense.
The new "auto" behavior is that we aim to restrict untrusted code (and the loader of such code uses prctrl to set that flag), then this whole "set STIBP for non-dumpable" makes little sense.
When STIBP is on, it not only prevents untrusted code from attacking, but also protects trusted code from being attacked. So a non-dumpable task running with STIBP protects itself from attacks by code running on a sibling CPU.
From software guidance:
"Setting ... STIBP ... on a logical processor prevents the predicted targets of indirect branches on any logical processor of that core from being controlled by software that executes (or executed previously) on another logical processor of the same core."
The intention was to put the TIF_SPEC_INDIR_BRANCH flag on a non-dumpable task, so it runs with STIBP and prevents itself from being attacked by code running on a sibling CPU. And when we context switch to a non-dumpable task, IBPB will be issued based on TIF_SPEC_INDIR_BRANCH to prevent attack from anything that ran on the same CPU.
A non-dumpable app by definition is *more* trusted, not less trusted.
So this model of "let's disable prediction for system processes" not only doesn't make sense, but it also unnecessarily penalizes those potentially very important system processes.
It is a trade-off of extra protection for the non-dumpable app against extra overhead. :(
Here it is the default behavior but that can be changed.
If we don't turn on STIBP for non-dumpable tasks by default, we should still do IBPB before switching to them. The STIBP and IBPB behaviors will then be untied for the non-dumpable default.
Also, "dumpable" in general is pretty oddly defined to be used for this.
The same (privileged) process can be dumpable or not depending on how it was started (i.e. if it was started by a regular user and became trusted through suid, it's not dumpable, but if it was started from a root process it remains dumpable).
So I'm just not convinced "dumpability" is meaningful for STIBP.
Tim
On Wed, Nov 21, 2018 at 9:41 AM Tim Chen tim.c.chen@linux.intel.com wrote:
When STIBP is on, it not only prevents untrusted code from attacking, but also protects trusted code from being attacked. So a non-dumpable task running with STIBP protects itself from attacks by code running on a sibling CPU.
I understand.
You didn't read my email about why "dumpable" is not sensible.
Linus
On 11/20/18 5:27 PM, Linus Torvalds wrote:
Also, "dumpable" in general is pretty oddly defined to be used for this.
The same (privileged) process can be dumpable or not depending on how it was started (i.e. if it was started by a regular user and became trusted through suid, it's not dumpable, but if it was started from a root process it remains dumpable).
So I'm just not convinced "dumpability" is meaningful for STIBP.
I think we're hoping that "dumpability" is at least correlated with sensitive processes. As you've pointed out, it's not a strict relationship, but there's still some meaning.
Let's not forget about things like gpg that do PR_SET_DUMPABLE completely independently of the actions that trigger the /proc/sys/fs/suid_dumpable behavior. Those will be non-dumpable regardless of how they were started.
In addition, things that are started via suid surely *do* have more attack surface than something started by root. We've been positing that these attacks get easier when the attacker and victim have a relationship, either via RPC, or the network, or *something*. suid basically *guarantees* there's a relationship between the privileged thing and _something_ untrusted.
Repurposing dumpable is really screwy and surely imprecise, but it really is the closest thing that we have without the new ABI.
On Wed, Nov 21, 2018 at 12:07 PM Dave Hansen dave.hansen@intel.com wrote:
Repurposing dumpable is really screwy and surely imprecise, but it really is the closest thing that we have without the new ABI.
But we *have* a new ABI.
So that's not a valid argument.
It's more like "this other thing that some other users use for something *entirely* different has in _one_ case the semantics you'd want, but in most cases not at all".
Because gpg really is the odd man out.
And it's not at all obvious that you can attack gpg using the hole that STIBP opens, when there are other timing attacks that are likely as good or better, and when we know that people who really care about the issue are already just disabling SMT entirely.
That's really the basic issue here: STIBP has horrible overhead, _and_ it's not even targeting the people who really want it, so we'd better be very targeted in how it's used.
Because we already know how badly things messed up when the use of STIBP wasn't targeted.
The _only_ very real and direct advantage "dumpable" has is that it hides the problem from benchmarks. Because benchmarks don't test non-dumpable processes.
But honestly, that sounds like a disadvantage to me. It smells like "let's hide the overhead dishonestly".
Linus
From: Peter Zijlstra <peterz@infradead.org>
From: Peter Zijlstra (Intel) <peterz@infradead.org>
Currently the sched_smt_present static key is only enabled when we encounter SMT topology. However, there is demand to also disable the key when the topology changes such that there is no SMT present anymore.
Implement this by making the key count the number of cores that have SMT enabled.
In particular, the SMT topology bits are set before we enable interrupts and similarly, are cleared after we disable interrupts for the last time and die.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/core.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 091e089..6fedf3a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5738,15 +5738,10 @@ int sched_cpu_activate(unsigned int cpu)

 #ifdef CONFIG_SCHED_SMT
 	/*
-	 * The sched_smt_present static key needs to be evaluated on every
-	 * hotplug event because at boot time SMT might be disabled when
-	 * the number of booted CPUs is limited.
-	 *
-	 * If then later a sibling gets hotplugged, then the key would stay
-	 * off and SMT scheduling would never be functional.
+	 * When going up, increment the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
-		static_branch_enable_cpuslocked(&sched_smt_present);
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_inc_cpuslocked(&sched_smt_present);
 #endif
 	set_cpu_active(cpu, true);

@@ -5790,6 +5785,14 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 */
 	synchronize_rcu_mult(call_rcu, call_rcu_sched);

+#ifdef CONFIG_SCHED_SMT
+	/*
+	 * When going down, decrement the number of cores with SMT present.
+	 */
+	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+		static_branch_dec_cpuslocked(&sched_smt_present);
+#endif
+
 	if (!sched_smp_initialized)
 		return 0;
Currently cpu_use_smt_and_hotplug is only set during boot time to indicate if SMT is in use.
However, the CPU topology may change, and when the last SMT sibling is offlined, the SMT code path can be skipped. The sched_smt_present key detects this condition.
Export sched_smt_present and incorporate it into cpu_use_smt_and_hotplug to disable SMT code when there are no paired siblings.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 include/linux/cpu.h  | 12 ++++++++++++
 kernel/sched/sched.h |  2 --
 2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4fef90a..2fc649d 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -100,6 +100,10 @@ static inline void cpu_maps_update_done(void)
 #endif /* CONFIG_SMP */
 extern struct bus_type cpu_subsys;

+#ifdef CONFIG_SCHED_SMT
+extern struct static_key_false sched_smt_present;
+#endif
+
 #ifdef CONFIG_HOTPLUG_CPU
 extern void cpus_write_lock(void);
 extern void cpus_write_unlock(void);
@@ -172,7 +176,15 @@ static inline void cpuhp_report_idle_dead(void) { }

 #if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT)
 DECLARE_STATIC_KEY_TRUE(cpu_smt_enabled);
+
+#ifdef CONFIG_SCHED_SMT
+#define cpu_use_smt_and_hotplug \
+	(static_branch_likely(&cpu_smt_enabled) && \
+	 static_branch_unlikely(&sched_smt_present))
+#else
 #define cpu_use_smt_and_hotplug	(static_branch_likely(&cpu_smt_enabled))
+#endif
+
 extern void cpu_smt_disable(bool force);
 extern void cpu_smt_check_topology_early(void);
 extern void cpu_smt_check_topology(void);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 618577f..e1e3f09 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -937,8 +937,6 @@ static inline int cpu_of(struct rq *rq)

 #ifdef CONFIG_SCHED_SMT

-extern struct static_key_false sched_smt_present;
-
 extern void __update_idle_core(struct rq *rq);

 static inline void update_idle_core(struct rq *rq)
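To illustrate the intended effect, a hypothetical caller would gate its SMT-only work on the combined macro, so the work is skipped both when SMT is administratively disabled and when no paired sibling remains online (a sketch only; flush_sibling_state() is made up):

/* Hypothetical kernel-side caller of the macro above. */
static void flush_sibling_state(void)
{
	/*
	 * False when SMT is disabled (cpu_smt_enabled) or when the last
	 * sibling of every core has gone offline (the sched_smt_present
	 * count dropped to zero).
	 */
	if (!cpu_use_smt_and_hotplug)
		return;

	/* ... STIBP/sibling-specific mitigation work ... */
}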
On 11/20/2018 03:59 PM, Tim Chen wrote:
Fix in this version bugs causing build problems for UP configuration.
Also merged in Jiri's change to extend STIBP for SECCOMP processes and renaming TIF_STIBP to TIF_SPEC_INDIR_BRANCH.
I've updated the boot options spectre_v2_app2app to on, off, auto, prctl and seccomp. This aligns with the options for other speculation related mitigations.
I tried to incorporate sched_smt_present to detect when we have all SMT going offline and we can disable the SMT path, which Peter suggested. This optimization that can be easily left out of the patch series and not backported. I've put these two patches at the end and they can be considered separately.
I've dropped the TIF flags re-organization patches as they are not needed in this patch series.
To do: Create a dedicated document on the mitigation options for Spectre V2.
My apologies that the v6 patch series is missing the first two patches in the series. Please ignore v6; I am resending the patch series as v7.
Tim