This is the backport of the Indirect Target Selection (ITS) mitigation for 5.10. It is only boot-tested, so I am sending it as an RFC for now. I hope a bot picks this up for some at-scale testing; meanwhile, I am running basic tests on the ITS mitigation.
In addition to the commits in the 5.15 ITS backport, the commits below are required to make the ITS mitigation work on 5.10. These are the prime targets for scrutiny:
  x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
  x86/alternatives: Introduce int3_emulate_jcc()
  x86/bhi: Define SPEC_CTRL_BHI_DIS_S
---
Borislav Petkov (AMD) (1):
      x86/alternative: Optimize returns patching

Daniel Sneddon (1):
      x86/bhi: Define SPEC_CTRL_BHI_DIS_S

Eric Biggers (1):
      x86/its: Fix build errors when CONFIG_MODULES=n

Josh Poimboeuf (1):
      x86/alternatives: Remove faulty optimization

Pawan Gupta (7):
      Documentation: x86/bugs/its: Add ITS documentation
      x86/its: Enumerate Indirect Target Selection (ITS) bug
      x86/its: Add support for ITS-safe indirect thunk
      x86/its: Add support for ITS-safe return thunk
      x86/its: Fix undefined reference to cpu_wants_rethunk_at()
      x86/its: Enable Indirect Target Selection mitigation
      x86/its: Add "vmexit" option to skip mitigation on some CPUs

Peter Zijlstra (4):
      x86/alternatives: Introduce int3_emulate_jcc()
      x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
      x86/its: Use dynamic thunks for indirect branches
      x86/its: FineIBT-paranoid vs ITS

Thomas Gleixner (1):
      x86/modules: Set VM_FLUSH_RESET_PERMS in module_alloc()
 Documentation/ABI/testing/sysfs-devices-system-cpu |   1 +
 Documentation/admin-guide/hw-vuln/index.rst        |   1 +
 .../hw-vuln/indirect-target-selection.rst          | 156 +++++++++++
 Documentation/admin-guide/kernel-parameters.txt    |  15 +
 arch/x86/Kconfig                                   |  11 +
 arch/x86/include/asm/alternative.h                 |  26 ++
 arch/x86/include/asm/cpufeatures.h                 |   6 +-
 arch/x86/include/asm/msr-index.h                   |  13 +-
 arch/x86/include/asm/nospec-branch.h               |  11 +
 arch/x86/include/asm/text-patching.h               |  31 +++
 arch/x86/kernel/alternative.c                      | 308 ++++++++++++++++++++-
 arch/x86/kernel/cpu/bugs.c                         | 139 +++++++++-
 arch/x86/kernel/cpu/common.c                       |  63 ++++-
 arch/x86/kernel/cpu/scattered.c                    |   1 +
 arch/x86/kernel/ftrace.c                           |   4 +-
 arch/x86/kernel/kprobes/core.c                     |  39 +--
 arch/x86/kernel/module.c                           |  14 +-
 arch/x86/kernel/static_call.c                      |   2 +-
 arch/x86/kernel/vmlinux.lds.S                      |   8 +
 arch/x86/kvm/x86.c                                 |   4 +-
 arch/x86/lib/retpoline.S                           |  39 +++
 arch/x86/net/bpf_jit_comp.c                        |   8 +-
 drivers/base/cpu.c                                 |   8 +
 include/linux/cpu.h                                |   2 +
 include/linux/module.h                             |   5 +
 25 files changed, 842 insertions(+), 73 deletions(-)
---
base-commit: 01e7e36b8606e5d4fddf795938010f7bfa3aa277
change-id: 20250609-its-5-10-a656ce2e08ce
Best regards,
commit 1ac116ce6468670eeda39345a5585df308243dca upstream.
Add the admin-guide for Indirect Target Selection (ITS).
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/hw-vuln/index.rst        |   1 +
 .../hw-vuln/indirect-target-selection.rst          | 156 +++++++++++++++++++++
 2 files changed, 157 insertions(+)
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index e020d1637e1c4880099eee517fa19fa611794255..04a7f9fea3f21a740c95c1b3dbd37d62c8ac5bf0 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -19,3 +19,4 @@ are configurable at compile, boot or run time. gather_data_sampling.rst srso reg-file-data-sampling + indirect-target-selection diff --git a/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst new file mode 100644 index 0000000000000000000000000000000000000000..4788e14ebce09aa715141c372188c37435f73dc1 --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst @@ -0,0 +1,156 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Indirect Target Selection (ITS) +=============================== + +ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were +released before Alder Lake. ITS may allow an attacker to control the prediction +of indirect branches and RETs located in the lower half of a cacheline. + +ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium). + +Scope of Impact +--------------- +- **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be + predicted with unintended target corresponding to a branch in the guest. + +- **Intra-Mode BTI**: In-kernel training such as through cBPF or other native + gadgets. + +- **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect + branches may still be predicted with targets corresponding to direct branches + executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which + should be available via distro updates. Alternatively microcode can be + obtained from Intel's github repository [#f1]_. + +Affected CPUs +------------- +Below is the list of ITS affected CPUs [#f2]_ [#f3]_: + + ======================== ============ ==================== =============== + Common name Family_Model eIBRS Intra-mode BTI + Guest/Host Isolation + ======================== ============ ==================== =============== + SKYLAKE_X (step >= 6) 06_55H Affected Affected + ICELAKE_X 06_6AH Not affected Affected + ICELAKE_D 06_6CH Not affected Affected + ICELAKE_L 06_7EH Not affected Affected + TIGERLAKE_L 06_8CH Not affected Affected + TIGERLAKE 06_8DH Not affected Affected + KABYLAKE_L (step >= 12) 06_8EH Affected Affected + KABYLAKE (step >= 13) 06_9EH Affected Affected + COMETLAKE 06_A5H Affected Affected + COMETLAKE_L 06_A6H Affected Affected + ROCKETLAKE 06_A7H Not affected Affected + ======================== ============ ==================== =============== + +- All affected CPUs enumerate Enhanced IBRS feature. +- IBPB isolation is affected on all ITS affected CPUs, and need a microcode + update for mitigation. +- None of the affected CPUs enumerate BHI_CTRL which was introduced in Golden + Cove (Alder Lake and Sapphire Rapids). This can help guests to determine the + host's affected status. +- Intel Atom CPUs are not affected by ITS. + +Mitigation +---------- +As only the indirect branches and RETs that have their last byte of instruction +in the lower half of the cacheline are vulnerable to ITS, the basic idea behind +the mitigation is to not allow indirect branches in the lower half. + +This is achieved by relying on existing retpoline support in the kernel, and in +compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly +added ITS-safe thunks. 
These safe thunks consists of indirect branch in the +second half of the cacheline. Not all retpoline sites are patched to thunks, if +a retpoline site is evaluated to be ITS-safe, it is replaced with an inline +indirect branch. + +Dynamic thunks +~~~~~~~~~~~~~~ +From a dynamically allocated pool of safe-thunks, each vulnerable site is +replaced with a new thunk, such that they get a unique address. This could +improve the branch prediction accuracy. Also, it is a defense-in-depth measure +against aliasing. + +Note, for simplicity, indirect branches in eBPF programs are always replaced +with a jump to a static thunk in __x86_indirect_its_thunk_array. If required, +in future this can be changed to use dynamic thunks. + +All vulnerable RETs are replaced with a static thunk, they do not use dynamic +thunks. This is because RETs get their prediction from RSB mostly that does not +depend on source address. RETs that underflow RSB may benefit from dynamic +thunks. But, RETs significantly outnumber indirect branches, and any benefit +from a unique source address could be outweighed by the increased icache +footprint and iTLB pressure. + +Retpoline +~~~~~~~~~ +Retpoline sequence also mitigates ITS-unsafe indirect branches. For this +reason, when retpoline is enabled, ITS mitigation only relocates the RETs to +safe thunks. Unless user requested the RSB-stuffing mitigation. + +Mitigation in guests +^^^^^^^^^^^^^^^^^^^^ +All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration +and Family/Model of the guest. This is because eIBRS feature could be hidden +from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which +indicates that the guest is running on an unaffected host. + +To prevent guests from unnecessarily deploying the mitigation on unaffected +platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When +a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit +is not set by any hardware, but is **intended for VMMs to synthesize** it for +guests as per the host's affected status. + +Mitigation options +^^^^^^^^^^^^^^^^^^ +The ITS mitigation can be controlled using the "indirect_target_selection" +kernel parameter. The available options are: + + ======== =================================================================== + on (default) Deploy the "Aligned branch/return thunks" mitigation. + If spectre_v2 mitigation enables retpoline, aligned-thunks are only + deployed for the affected RET instructions. Retpoline mitigates + indirect branches. + + off Disable ITS mitigation. + + vmexit Equivalent to "=on" if the CPU is affected by guest/host isolation + part of ITS. Otherwise, mitigation is not deployed. This option is + useful when host userspace is not in the threat model, and only + attacks from guest to host are considered. + + force Force the ITS bug and deploy the default mitigation. + ======== =================================================================== + +Sysfs reporting +--------------- + +The sysfs file showing ITS mitigation status is: + + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection + +Note, microcode mitigation status is not reported in this file. + +The possible values in this file are: + +.. list-table:: + + * - Not affected + - The processor is not vulnerable. + * - Vulnerable + - System is vulnerable and no mitigation has been applied. + * - Vulnerable, KVM: Not affected + - System is vulnerable to intra-mode BTI, but not affected by eIBRS + guest/host isolation. 
+ * - Mitigation: Aligned branch/return thunks + - The mitigation is enabled, affected indirect branches and RETs are + relocated to safe thunks. + +References +---------- +.. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files + +.. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-... + +.. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list
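As an aside on the sysfs interface documented above, here is a minimal userspace sketch that reads the reported status. The sysfs path is the one from the documentation; everything else (file handling, buffer size) is illustrative only and not part of the patch series.

/*
 * Minimal userspace sketch: read the ITS mitigation status from sysfs.
 * The path is the one documented above; error handling is kept minimal.
 */
#include <stdio.h>

int main(void)
{
    const char *path =
        "/sys/devices/system/cpu/vulnerabilities/indirect_target_selection";
    char status[256];
    FILE *f = fopen(path, "r");

    if (!f) {
        perror("fopen");
        return 1;
    }
    if (fgets(status, sizeof(status), f))
        printf("ITS status: %s", status);
    fclose(f);
    return 0;
}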
From: Daniel Sneddon <daniel.sneddon@linux.intel.com>
commit 0f4a837615ff925ba62648d280a861adf1582df7 upstream.
Newer processors support a hardware control, BHI_DIS_S, to mitigate Branch History Injection (BHI). Setting BHI_DIS_S protects the kernel from userspace BHI attacks without having to manually overwrite the branch history.
Define MSR_SPEC_CTRL bit BHI_DIS_S and its enumeration CPUID.BHI_CTRL. Mitigation is enabled later.
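As an illustration of the definition, below is a minimal, standalone sketch (not kernel code) of how the new SPEC_CTRL bit is composed and OR-ed into a cached MSR value. The bit position (10) comes from the hunk below; the BIT() macro and the stand-in variable exist only for the example.

/*
 * Illustrative sketch: compose SPEC_CTRL_BHI_DIS_S (bit 10, per the patch)
 * and OR it into a stand-in for the cached SPEC_CTRL MSR value.
 */
#include <stdint.h>
#include <stdio.h>

#define BIT(n)                      (1ULL << (n))
#define SPEC_CTRL_BHI_DIS_S_SHIFT   10  /* Disable Branch History Injection behavior */
#define SPEC_CTRL_BHI_DIS_S         BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)

int main(void)
{
    uint64_t spec_ctrl = 0;     /* stand-in for the cached MSR value */

    spec_ctrl |= SPEC_CTRL_BHI_DIS_S;
    printf("SPEC_CTRL with BHI_DIS_S set: 0x%llx\n",
           (unsigned long long)spec_ctrl);
    return 0;
}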
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/cpufeatures.h | 2 +-
 arch/x86/include/asm/msr-index.h   | 5 ++++-
 arch/x86/kernel/cpu/scattered.c    | 1 +
 3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index f3365ec973763bbfe94c335ee9811f93272b0c80..52810a7f6b115c6d702976ee0fe38cab5c49ae4a 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -289,7 +289,7 @@ #define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */ #define X86_FEATURE_SPLIT_LOCK_DETECT (11*32+ 6) /* #AC for split lock */ #define X86_FEATURE_PER_THREAD_MBA (11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */ -/* FREE! (11*32+ 8) */ +#define X86_FEATURE_BHI_CTRL (11*32+ 8) /* "" BHI_DIS_S HW control available */ /* FREE! (11*32+ 9) */ #define X86_FEATURE_ENTRY_IBPB (11*32+10) /* "" Issue an IBPB on kernel entry */ #define X86_FEATURE_RRSBA_CTRL (11*32+11) /* "" RET prediction control */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 7fd03f4ff9ed29d6e8deee29931ff0453142b902..d6a1ad1ee86e7c871a5c1959b4faaf2fd398034f 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -55,10 +55,13 @@ #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ #define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */ #define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT) +#define SPEC_CTRL_BHI_DIS_S_SHIFT 10 /* Disable Branch History Injection behavior */ +#define SPEC_CTRL_BHI_DIS_S BIT(SPEC_CTRL_BHI_DIS_S_SHIFT)
/* A mask for bits which the kernel toggles when controlling mitigations */ #define SPEC_CTRL_MITIGATIONS_MASK (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD \ - | SPEC_CTRL_RRSBA_DIS_S) + | SPEC_CTRL_RRSBA_DIS_S \ + | SPEC_CTRL_BHI_DIS_S)
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ #define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */ diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c index f1cd1b6fb99ef5696c71a47304eff45c34597f91..53a9a55dc0866ba10f5a97f5b7fcc9a728be83e2 100644 --- a/arch/x86/kernel/cpu/scattered.c +++ b/arch/x86/kernel/cpu/scattered.c @@ -27,6 +27,7 @@ static const struct cpuid_bit cpuid_bits[] = { { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 }, { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 }, { X86_FEATURE_RRSBA_CTRL, CPUID_EDX, 2, 0x00000007, 2 }, + { X86_FEATURE_BHI_CTRL, CPUID_EDX, 4, 0x00000007, 2 }, { X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 }, { X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 }, { X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },
commit 159013a7ca18c271ff64192deb62a689b622d860 upstream.
The ITS bug in some pre-Alder Lake Intel CPUs may allow indirect branches in the first half of a cache line to get predicted to the target of a branch located in the second half of the cache line.
Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits.
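For reference, here is a hedged, standalone sketch of the enumeration logic that the hunk below adds as vulnerable_to_its(). The ITS_NO bit position (62) and the decision order are from the patch; the plain booleans stand in for the cpufeature, vendor and blacklist checks.

/*
 * Sketch of vulnerable_to_its() from the patch, redone as a standalone
 * function with plain booleans instead of cpufeature/MSR helpers.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ARCH_CAP_ITS_NO (1ULL << 62)    /* synthesized by VMMs, not set by HW */

static bool vulnerable_to_its(uint64_t arch_cap_msr, bool is_intel,
                              bool has_bhi_ctrl, bool is_guest,
                              bool model_in_blacklist)
{
    if (arch_cap_msr & ARCH_CAP_ITS_NO) /* "immunity" bit trumps everything */
        return false;
    if (!is_intel)
        return false;
    if (has_bhi_ctrl)                   /* no affected CPU enumerates BHI_CTRL */
        return false;
    if (is_guest)                       /* host status unknown, assume affected */
        return true;
    return model_in_blacklist;
}

int main(void)
{
    /* A guest whose VMM did not expose ITS_NO is treated as affected. */
    printf("guest without ITS_NO: %d\n",
           vulnerable_to_its(0, true, false, true, false));
    /* ITS_NO set by the VMM means not affected. */
    printf("guest with ITS_NO:    %d\n",
           vulnerable_to_its(ARCH_CAP_ITS_NO, true, false, true, false));
    return 0;
}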
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  8 ++++++
 arch/x86/kernel/cpu/common.c       | 58 ++++++++++++++++++++++++++++++--------
 arch/x86/kvm/x86.c                 |  4 ++-
 4 files changed, 58 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 52810a7f6b115c6d702976ee0fe38cab5c49ae4a..b9aefb75eafff39ae0eaf2fca5de07e96e864234 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -459,4 +459,5 @@ #define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */ #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ +#define X86_BUG_ITS X86_BUG(1*32 + 5) /* CPU is affected by Indirect Target Selection */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index d6a1ad1ee86e7c871a5c1959b4faaf2fd398034f..a479530e59abe0e37634b8f81eec2fe70f60d902 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -179,6 +179,14 @@ * VERW clears CPU Register * File. */ +#define ARCH_CAP_ITS_NO BIT_ULL(62) /* + * Not susceptible to + * Indirect Target Selection. + * This bit is not set by + * HW, but is synthesized by + * VMMs for guests to know + * their affected status. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 840fdffec850b5f3c67e6e5b69a0eb0e1169f2c7..a5f4f8e637713569f2671dbddaf8203415560256 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1134,6 +1134,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { #define GDS BIT(6) /* CPU is affected by Register File Data Sampling */ #define RFDS BIT(7) +/* CPU is affected by Indirect Target Selection */ +#define ITS BIT(8)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1145,22 +1147,25 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO), VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS), - VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x0, 0x5), MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0xb), MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xc), MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS), - VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS), - VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), @@ -1224,6 +1229,32 @@ static bool __init vulnerable_to_rfds(u64 ia32_cap) return cpu_matches(cpu_vuln_blacklist, RFDS); }
+static bool __init vulnerable_to_its(u64 x86_arch_cap_msr) +{ + /* The "immunity" bit trumps everything else: */ + if (x86_arch_cap_msr & ARCH_CAP_ITS_NO) + return false; + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + return false; + + /* None of the affected CPUs have BHI_CTRL */ + if (boot_cpu_has(X86_FEATURE_BHI_CTRL)) + return false; + + /* + * If a VMM did not expose ITS_NO, assume that a guest could + * be running on a vulnerable hardware or may migrate to such + * hardware. + */ + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return true; + + if (cpu_matches(cpu_vuln_blacklist, ITS)) + return true; + + return false; +} + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = x86_read_arch_cap_msr(); @@ -1338,6 +1369,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
+ if (vulnerable_to_its(ia32_cap)) + setup_force_cpu_bug(X86_BUG_ITS); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bc295439360e5a844bed2d0f6345763455180b1d..b61f697479a37f78d854cf51b354cece8f313671 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1390,7 +1390,7 @@ static unsigned int num_msr_based_features; ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \ ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \ ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \ - ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR) + ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_ITS_NO)
static u64 kvm_get_arch_capabilities(void) { @@ -1429,6 +1429,8 @@ static u64 kvm_get_arch_capabilities(void) data |= ARCH_CAP_MDS_NO; if (!boot_cpu_has_bug(X86_BUG_RFDS)) data |= ARCH_CAP_RFDS_NO; + if (!boot_cpu_has_bug(X86_BUG_ITS)) + data |= ARCH_CAP_ITS_NO;
if (!boot_cpu_has(X86_FEATURE_RTM)) { /*
From: Peter Zijlstra <peterz@infradead.org>
commit db7adcfd1cec4e95155e37bc066fddab302c6340 upstream.
Move the kprobe Jcc emulation into int3_emulate_jcc() so it can be used by more code -- specifically static_call() will need this.
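For context, here is a standalone sketch of the condition-code evaluation that int3_emulate_jcc() performs. The jcc_mask table and the handling of cc >= 0xc are taken from the hunk below; the EFLAGS bit positions are the architectural ones.

/*
 * Standalone sketch of the Jcc condition evaluation moved into
 * int3_emulate_jcc(): decide whether a conditional jump is taken from the
 * low nibble of the Jcc opcode and a flags value.
 */
#include <stdbool.h>
#include <stdio.h>

#define X86_EFLAGS_CF     (1UL << 0)
#define X86_EFLAGS_PF     (1UL << 2)
#define X86_EFLAGS_ZF     (1UL << 6)
#define X86_EFLAGS_SF     (1UL << 7)
#define X86_EFLAGS_OF     (1UL << 11)
#define X86_EFLAGS_SF_BIT 7
#define X86_EFLAGS_OF_BIT 11

/* Return true if condition code 'cc' (low nibble of the Jcc opcode) matches. */
static bool jcc_taken(unsigned long flags, unsigned char cc)
{
    static const unsigned long jcc_mask[6] = {
        X86_EFLAGS_OF, X86_EFLAGS_CF, X86_EFLAGS_ZF,
        X86_EFLAGS_CF | X86_EFLAGS_ZF, X86_EFLAGS_SF, X86_EFLAGS_PF,
    };
    bool invert = cc & 1;
    bool match;

    if (cc < 0xc) {
        match = flags & jcc_mask[cc >> 1];
    } else {
        match = ((flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
                ((flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
        if (cc >= 0xe)
            match = match || (flags & X86_EFLAGS_ZF);
    }
    return (match && !invert) || (!match && invert);
}

int main(void)
{
    /* JE (cc == 0x4) is taken when ZF is set, JNE (cc == 0x5) when it is clear. */
    printf("JE  with ZF=1: %d\n", jcc_taken(X86_EFLAGS_ZF, 0x4));
    printf("JNE with ZF=1: %d\n", jcc_taken(X86_EFLAGS_ZF, 0x5));
    return 0;
}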
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20230123210607.057678245@infradead.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/text-patching.h | 31 +++++++++++++++++++++++++++
 arch/x86/kernel/kprobes/core.c       | 38 ++++++++----------------------------
 2 files changed, 39 insertions(+), 30 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h index c6015b4074614c39c740dec407b564aa3f0f396e..7281ce64e99d8a673622b906141c18a612b7ea18 100644 --- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h @@ -181,6 +181,37 @@ void int3_emulate_ret(struct pt_regs *regs) unsigned long ip = int3_emulate_pop(regs); int3_emulate_jmp(regs, ip); } + +static __always_inline +void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp) +{ + static const unsigned long jcc_mask[6] = { + [0] = X86_EFLAGS_OF, + [1] = X86_EFLAGS_CF, + [2] = X86_EFLAGS_ZF, + [3] = X86_EFLAGS_CF | X86_EFLAGS_ZF, + [4] = X86_EFLAGS_SF, + [5] = X86_EFLAGS_PF, + }; + + bool invert = cc & 1; + bool match; + + if (cc < 0xc) { + match = regs->flags & jcc_mask[cc >> 1]; + } else { + match = ((regs->flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^ + ((regs->flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT); + if (cc >= 0xe) + match = match || (regs->flags & X86_EFLAGS_ZF); + } + + if ((match && !invert) || (!match && invert)) + ip += disp; + + int3_emulate_jmp(regs, ip); +} + #endif /* !CONFIG_UML_X86 */
#endif /* _ASM_X86_TEXT_PATCHING_H */ diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 6d59c8e7719b1f182dec975cd360aedbda532b32..d481825f12bfee53177399ad5ad6aaf7a7fa9c6b 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -462,50 +462,26 @@ static void kprobe_emulate_call(struct kprobe *p, struct pt_regs *regs) } NOKPROBE_SYMBOL(kprobe_emulate_call);
-static nokprobe_inline -void __kprobe_emulate_jmp(struct kprobe *p, struct pt_regs *regs, bool cond) +static void kprobe_emulate_jmp(struct kprobe *p, struct pt_regs *regs) { unsigned long ip = regs->ip - INT3_INSN_SIZE + p->ainsn.size;
- if (cond) - ip += p->ainsn.rel32; + ip += p->ainsn.rel32; int3_emulate_jmp(regs, ip); } - -static void kprobe_emulate_jmp(struct kprobe *p, struct pt_regs *regs) -{ - __kprobe_emulate_jmp(p, regs, true); -} NOKPROBE_SYMBOL(kprobe_emulate_jmp);
-static const unsigned long jcc_mask[6] = { - [0] = X86_EFLAGS_OF, - [1] = X86_EFLAGS_CF, - [2] = X86_EFLAGS_ZF, - [3] = X86_EFLAGS_CF | X86_EFLAGS_ZF, - [4] = X86_EFLAGS_SF, - [5] = X86_EFLAGS_PF, -}; - static void kprobe_emulate_jcc(struct kprobe *p, struct pt_regs *regs) { - bool invert = p->ainsn.jcc.type & 1; - bool match; + unsigned long ip = regs->ip - INT3_INSN_SIZE + p->ainsn.size;
- if (p->ainsn.jcc.type < 0xc) { - match = regs->flags & jcc_mask[p->ainsn.jcc.type >> 1]; - } else { - match = ((regs->flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^ - ((regs->flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT); - if (p->ainsn.jcc.type >= 0xe) - match = match || (regs->flags & X86_EFLAGS_ZF); - } - __kprobe_emulate_jmp(p, regs, (match && !invert) || (!match && invert)); + int3_emulate_jcc(regs, p->ainsn.jcc.type, ip, p->ainsn.rel32); } NOKPROBE_SYMBOL(kprobe_emulate_jcc);
static void kprobe_emulate_loop(struct kprobe *p, struct pt_regs *regs) { + unsigned long ip = regs->ip - INT3_INSN_SIZE + p->ainsn.size; bool match;
if (p->ainsn.loop.type != 3) { /* LOOP* */ @@ -533,7 +509,9 @@ static void kprobe_emulate_loop(struct kprobe *p, struct pt_regs *regs) else if (p->ainsn.loop.type == 1) /* LOOPE */ match = match && (regs->flags & X86_EFLAGS_ZF);
- __kprobe_emulate_jmp(p, regs, match); + if (match) + ip += p->ainsn.rel32; + int3_emulate_jmp(regs, ip); } NOKPROBE_SYMBOL(kprobe_emulate_loop);
From: Peter Zijlstra <peterz@infradead.org>
commit ac0ee0a9560c97fa5fe1409e450c2425d4ebd17a upstream.
In order to re-write Jcc.d32 instructions, text_poke_bp() needs to be taught about them.

The biggest hurdle is that the whole machinery is currently made for 5-byte instructions; extending this would grow struct text_poke_loc, which is currently a nice 16 bytes and is used in an array.

However, since text_poke_loc contains a full copy of the (s32) displacement, it is possible to map the Jcc.d32 2-byte opcodes to the Jcc.d8 1-byte opcode for the int3 emulation.

This then leaves the replacement bytes; fudge that by only storing the last 5 bytes and adding the rule that a 'length == 6' instruction will be prefixed with a 0x0f byte.
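To make the scheme concrete, here is a small standalone sketch of the mapping described above. The subtract-0x10 rule and the "len == 6 means re-add the 0x0f prefix" convention mirror the hunk below; the example instruction bytes are made up.

/*
 * Sketch: a Jcc.d32 instruction is 0x0f 0x8? imm32 (6 bytes). Only the last
 * 5 bytes are stored, the second opcode byte 0x8? is mapped to the Jcc.d8
 * opcode 0x7? by subtracting 0x10, and a stored length of 6 means "re-add
 * the 0x0f prefix when writing the replacement back".
 */
#include <stdio.h>

int main(void)
{
    unsigned char jcc_d32[6] = { 0x0f, 0x84, 0x00, 0x10, 0x00, 0x00 }; /* JE rel32 */
    unsigned char stored_opcode;
    int len = sizeof(jcc_d32);

    /* Map Jcc.d32 onto Jcc.d8 and use len to distinguish (as in the patch). */
    stored_opcode = jcc_d32[1] - 0x10;

    printf("stored opcode: 0x%02x (Jcc.d8 form of 0x0f 0x%02x)\n",
           stored_opcode, jcc_d32[1]);
    if (len == 6)
        printf("len == 6: prefix the replacement bytes with 0x0f on write-back\n");
    return 0;
}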
Change-Id: Ie3f72c6b92f865d287c8940e5a87e59d41cfaa27
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20230123210607.115718513@infradead.org
[cascardo: there is no emit_call_track_retpoline]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/alternative.c | 56 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 47 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 9ceef8515c0312b292d67ac2d2ffbb1b7d6acd1d..70a6165d4e11899b6e0de8548255fef6b913da08 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -506,6 +506,12 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start, kasan_enable_current(); }
+static inline bool is_jcc32(struct insn *insn) +{ + /* Jcc.d32 second opcode byte is in the range: 0x80-0x8f */ + return insn->opcode.bytes[0] == 0x0f && (insn->opcode.bytes[1] & 0xf0) == 0x80; +} + #if defined(CONFIG_RETPOLINE) && defined(CONFIG_STACK_VALIDATION)
/* @@ -1331,6 +1337,11 @@ void text_poke_sync(void) on_each_cpu(do_sync_core, NULL, 1); }
+/* + * NOTE: crazy scheme to allow patching Jcc.d32 but not increase the size of + * this thing. When len == 6 everything is prefixed with 0x0f and we map + * opcode to Jcc.d8, using len to distinguish. + */ struct text_poke_loc { /* addr := _stext + rel_addr */ s32 rel_addr; @@ -1452,6 +1463,10 @@ noinstr int poke_int3_handler(struct pt_regs *regs) int3_emulate_jmp(regs, (long)ip + tp->disp); break;
+ case 0x70 ... 0x7f: /* Jcc */ + int3_emulate_jcc(regs, tp->opcode & 0xf, (long)ip, tp->disp); + break; + default: BUG(); } @@ -1525,16 +1540,26 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries * Second step: update all but the first byte of the patched range. */ for (do_sync = 0, i = 0; i < nr_entries; i++) { - u8 old[POKE_MAX_OPCODE_SIZE] = { tp[i].old, }; + u8 old[POKE_MAX_OPCODE_SIZE+1] = { tp[i].old, }; + u8 _new[POKE_MAX_OPCODE_SIZE+1]; + const u8 *new = tp[i].text; int len = tp[i].len;
if (len - INT3_INSN_SIZE > 0) { memcpy(old + INT3_INSN_SIZE, text_poke_addr(&tp[i]) + INT3_INSN_SIZE, len - INT3_INSN_SIZE); + + if (len == 6) { + _new[0] = 0x0f; + memcpy(_new + 1, new, 5); + new = _new; + } + text_poke(text_poke_addr(&tp[i]) + INT3_INSN_SIZE, - (const char *)tp[i].text + INT3_INSN_SIZE, + new + INT3_INSN_SIZE, len - INT3_INSN_SIZE); + do_sync++; }
@@ -1562,8 +1587,7 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries * The old instruction is recorded so that the event can be * processed forwards or backwards. */ - perf_event_text_poke(text_poke_addr(&tp[i]), old, len, - tp[i].text, len); + perf_event_text_poke(text_poke_addr(&tp[i]), old, len, new, len); }
if (do_sync) { @@ -1580,10 +1604,15 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries * replacing opcode. */ for (do_sync = 0, i = 0; i < nr_entries; i++) { - if (tp[i].text[0] == INT3_INSN_OPCODE) + u8 byte = tp[i].text[0]; + + if (tp[i].len == 6) + byte = 0x0f; + + if (byte == INT3_INSN_OPCODE) continue;
- text_poke(text_poke_addr(&tp[i]), tp[i].text, INT3_INSN_SIZE); + text_poke(text_poke_addr(&tp[i]), &byte, INT3_INSN_SIZE); do_sync++; }
@@ -1601,9 +1630,11 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr, const void *opcode, size_t len, const void *emulate) { struct insn insn; - int ret, i; + int ret, i = 0;
- memcpy((void *)tp->text, opcode, len); + if (len == 6) + i = 1; + memcpy((void *)tp->text, opcode+i, len-i); if (!emulate) emulate = opcode;
@@ -1614,6 +1645,13 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr, tp->len = len; tp->opcode = insn.opcode.bytes[0];
+ if (is_jcc32(&insn)) { + /* + * Map Jcc.d32 onto Jcc.d8 and use len to distinguish. + */ + tp->opcode = insn.opcode.bytes[1] - 0x10; + } + switch (tp->opcode) { case RET_INSN_OPCODE: case JMP32_INSN_OPCODE: @@ -1630,7 +1668,6 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr, BUG_ON(len != insn.length); };
- switch (tp->opcode) { case INT3_INSN_OPCODE: case RET_INSN_OPCODE: @@ -1639,6 +1676,7 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr, case CALL_INSN_OPCODE: case JMP32_INSN_OPCODE: case JMP8_INSN_OPCODE: + case 0x70 ... 0x7f: /* Jcc */ tp->disp = insn.immediate.value; break;
commit 8754e67ad4ac692c67ff1f99c0d07156f04ae40c upstream.
Due to ITS, indirect branches in the lower half of a cacheline may be vulnerable to a branch target injection attack.

Introduce ITS-safe thunks, and patch indirect branches in the lower half of a cacheline to use them. Also thunk any eBPF-generated indirect branches in emit_indirect_jump().

The below categories of indirect branches are not mitigated:

- Indirect branches in the .init section are not mitigated because they are
  discarded after boot.
- Indirect branches that are explicitly marked retpoline-safe.

Note that retpoline also mitigates indirect branches against ITS. This is because the retpoline sequence fills an RSB entry before the RET, and so it does not suffer from the RSB-underflow part of ITS.
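As a concrete illustration, here is a standalone sketch of the address check that the hunk below adds as cpu_wants_indirect_its_thunk_at(). The 2-or-3-byte opcode-length rule and the lower-half test are from the patch; the example addresses are arbitrary.

/*
 * Sketch of the ITS address check: an indirect "jmp/call *%reg" is 2 bytes
 * for legacy registers and 3 bytes when a REX prefix is needed (reg >= 8);
 * the branch needs the ITS thunk when its last byte lands in the lower half
 * (bytes 0..31) of a 64-byte cacheline.
 */
#include <stdbool.h>
#include <stdio.h>

static bool wants_its_thunk(unsigned long addr, int reg)
{
    /* Indirect branch opcode is 2 or 3 bytes depending on reg */
    addr += 1 + reg / 8;

    /* Lower half of the cacheline? */
    return !(addr & 0x20);
}

int main(void)
{
    /* A branch starting at offset 0x10 of a line ends in the lower half... */
    printf("addr 0xffff0010, rax: %d\n", wants_its_thunk(0xffff0010UL, 0));
    /* ...while one starting at offset 0x30 ends in the upper half. */
    printf("addr 0xffff0030, rax: %d\n", wants_its_thunk(0xffff0030UL, 0));
    return 0;
}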
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/Kconfig                     | 11 ++++++
 arch/x86/include/asm/cpufeatures.h   |  2 +-
 arch/x86/include/asm/nospec-branch.h |  5 +++
 arch/x86/kernel/alternative.c        | 77 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/vmlinux.lds.S        |  6 +++
 arch/x86/lib/retpoline.S             | 28 +++++++++++++
 arch/x86/net/bpf_jit_comp.c          |  6 ++-
 7 files changed, 133 insertions(+), 2 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 93a1f9937a9bb2d905d82f2d3c2bd2c166d01744..c399dbfcf3ad3d84aafef13156c6d03cb3e85e96 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2521,6 +2521,17 @@ config MITIGATION_RFDS stored in floating point, vector and integer registers. See also file:Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst
+config MITIGATION_ITS + bool "Enable Indirect Target Selection mitigation" + depends on CPU_SUP_INTEL && X86_64 + depends on RETPOLINE && RETHUNK + default y + help + Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in + BPU on some Intel CPUs that may allow Spectre V2 style attacks. If + disabled, mitigation cannot be enabled via cmdline. + See file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + endif
config ARCH_HAS_ADD_PAGES diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index b9aefb75eafff39ae0eaf2fca5de07e96e864234..a9ccd2ac2d7ae1dd5b9da90fe7d7bd31abb3df96 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -290,7 +290,7 @@ #define X86_FEATURE_SPLIT_LOCK_DETECT (11*32+ 6) /* #AC for split lock */ #define X86_FEATURE_PER_THREAD_MBA (11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */ #define X86_FEATURE_BHI_CTRL (11*32+ 8) /* "" BHI_DIS_S HW control available */ -/* FREE! (11*32+ 9) */ +#define X86_FEATURE_INDIRECT_THUNK_ITS (11*32+ 9) /* "" Use thunk for indirect branches in lower half of cacheline */ #define X86_FEATURE_ENTRY_IBPB (11*32+10) /* "" Issue an IBPB on kernel entry */ #define X86_FEATURE_RRSBA_CTRL (11*32+11) /* "" RET prediction control */ #define X86_FEATURE_RETPOLINE (11*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */ diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 7978d5fe1ce6e491ab0bc485611a45dae6a40bdc..4cc0ee529325e62d5b30f171933b47f31ff14a92 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -243,6 +243,11 @@ extern void (*x86_return_thunk)(void);
typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
+#define ITS_THUNK_SIZE 64 +typedef u8 its_thunk_t[ITS_THUNK_SIZE]; + +extern its_thunk_t __x86_indirect_its_thunk_array[]; + #define GEN(reg) \ extern retpoline_thunk_t __x86_indirect_thunk_ ## reg; #include <asm/GEN-for-each-reg.h> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 70a6165d4e11899b6e0de8548255fef6b913da08..d8e16c479fd0508fceecedae19e992516aeecc9e 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -550,6 +550,74 @@ static int emit_indirect(int op, int reg, u8 *bytes) return i; }
+#ifdef CONFIG_MITIGATION_ITS + +static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, + void *call_dest, void *jmp_dest) +{ + u8 op = insn->opcode.bytes[0]; + int i = 0; + + /* + * Clang does 'weird' Jcc __x86_indirect_thunk_r11 conditional + * tail-calls. Deal with them. + */ + if (is_jcc32(insn)) { + bytes[i++] = op; + op = insn->opcode.bytes[1]; + goto clang_jcc; + } + + if (insn->length == 6) + bytes[i++] = 0x2e; /* CS-prefix */ + + switch (op) { + case CALL_INSN_OPCODE: + __text_gen_insn(bytes+i, op, addr+i, + call_dest, + CALL_INSN_SIZE); + i += CALL_INSN_SIZE; + break; + + case JMP32_INSN_OPCODE: +clang_jcc: + __text_gen_insn(bytes+i, op, addr+i, + jmp_dest, + JMP32_INSN_SIZE); + i += JMP32_INSN_SIZE; + break; + + default: + WARN(1, "%pS %px %*ph\n", addr, addr, 6, addr); + return -1; + } + + WARN_ON_ONCE(i != insn->length); + + return i; +} + +static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) +{ + return __emit_trampoline(addr, insn, bytes, + __x86_indirect_its_thunk_array[reg], + __x86_indirect_its_thunk_array[reg]); +} + +/* Check if an indirect branch is at ITS-unsafe address */ +static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return false; + + /* Indirect branch opcode is 2 or 3 bytes depending on reg */ + addr += 1 + reg / 8; + + /* Lower-half of the cacheline? */ + return !(addr & 0x20); +} +#endif + /* * Rewrite the compiler generated retpoline thunk calls. * @@ -621,6 +689,15 @@ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes) bytes[i++] = 0xe8; /* LFENCE */ }
+#ifdef CONFIG_MITIGATION_ITS + /* + * Check if the address of last byte of emitted-indirect is in + * lower-half of the cacheline. Such branches need ITS mitigation. + */ + if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + i, reg)) + return emit_its_trampoline(addr, insn, reg, bytes); +#endif + ret = emit_indirect(op, reg, bytes + i); if (ret < 0) return ret; diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 740f87d8aa48142845f4d726bf0962171362deff..6cee70927281f30871cb5a9680d11c5bf44b4156 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -538,6 +538,12 @@ INIT_PER_CPU(irq_stack_backing_store); "SRSO function pair won't alias"); #endif
+#ifdef CONFIG_MITIGATION_ITS +. = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_thunk_rax not in second half of cacheline"); +. = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart"); +. = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array"); +#endif + #endif /* CONFIG_X86_32 */
#ifdef CONFIG_KEXEC_CORE diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S index d1902213a0d6379b369cde97615b8398cb080947..9bde4e9d8c2a36bfa2bc804efefefe10ffd5eae7 100644 --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -255,6 +255,34 @@ SYM_FUNC_START(entry_untrain_ret) SYM_FUNC_END(entry_untrain_ret) __EXPORT_THUNK(entry_untrain_ret)
+#ifdef CONFIG_MITIGATION_ITS + +.macro ITS_THUNK reg + +SYM_INNER_LABEL(__x86_indirect_its_thunk_\reg, SYM_L_GLOBAL) + UNWIND_HINT_EMPTY + ANNOTATE_NOENDBR + ANNOTATE_RETPOLINE_SAFE + jmp *%\reg + int3 + .align 32, 0xcc /* fill to the end of the line */ + .skip 32, 0xcc /* skip to the next upper half */ +.endm + +/* ITS mitigation requires thunks be aligned to upper half of cacheline */ +.align 64, 0xcc +.skip 32, 0xcc +SYM_CODE_START(__x86_indirect_its_thunk_array) + +#define GEN(reg) ITS_THUNK reg +#include <asm/GEN-for-each-reg.h> +#undef GEN + + .align 64, 0xcc +SYM_CODE_END(__x86_indirect_its_thunk_array) + +#endif + SYM_CODE_START(__x86_return_thunk) UNWIND_HINT_FUNC ANNOTATE_NOENDBR diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index d7d592c0929835338302146dad95248016a36808..6225e8a8349f76bb8d62bb5613823bc083ffaf4b 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -387,7 +387,11 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip) int cnt = 0;
#ifdef CONFIG_RETPOLINE - if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { + if (IS_ENABLED(CONFIG_MITIGATION_ITS) && + cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) { + OPTIMIZER_HIDE_VAR(reg); + emit_jump(&prog, &__x86_indirect_its_thunk_array[reg], ip); + } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { EMIT_LFENCE(); EMIT2(0xFF, 0xE0 + reg); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
From: "Borislav Petkov (AMD)" bp@alien8.de
commit d2408e043e7296017420aa5929b3bba4d5e61013 upstream.
Instead of decoding each instruction in the return sites range only to realize that that return site is a jump to the default return thunk (which is needed when X86_FEATURE_RETHUNK is enabled), lift that check before the loop and get rid of that loop overhead.
Add comments about what gets patched, while at it.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230512120952.7924-1-bp@alien8.de
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/alternative.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index d8e16c479fd0508fceecedae19e992516aeecc9e..9de566a77a8e6b9ca2610edda731cd2f089e6b0b 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -775,13 +775,12 @@ static int patch_return(void *addr, struct insn *insn, u8 *bytes) { int i = 0;
+ /* Patch the custom return thunks... */ if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { - if (x86_return_thunk == __x86_return_thunk) - return -1; - i = JMP32_INSN_SIZE; __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); } else { + /* ... or patch them out if not needed. */ bytes[i++] = RET_INSN_OPCODE; }
@@ -794,6 +793,14 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { s32 *s;
+ /* + * Do not patch out the default return thunks if those needed are the + * ones generated by the compiler. + */ + if (cpu_feature_enabled(X86_FEATURE_RETHUNK) && + (x86_return_thunk == __x86_return_thunk)) + return; + for (s = start; s < end; s++) { void *dest = NULL, *addr = (void *)s + *s; struct insn insn;
From: Josh Poimboeuf <jpoimboe@kernel.org>
commit 4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124 upstream.
The following commit
095b8303f383 ("x86/alternative: Make custom return thunk unconditional")
made '__x86_return_thunk' a placeholder value. All code setting X86_FEATURE_RETHUNK also changes the value of 'x86_return_thunk'. So the optimization at the beginning of apply_returns() is dead code.
Also, before the above-mentioned commit, the optimization actually had a bug: it bypassed __static_call_fixup(), causing some raw returns to remain unpatched in static call trampolines. Thus the 'Fixes' tag.
Fixes: d2408e043e72 ("x86/alternative: Optimize returns patching")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/16d19d2249d4485d8380fb215ffaae81e6b8119e.169388998...
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/alternative.c | 8 --------
 1 file changed, 8 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 9de566a77a8e6b9ca2610edda731cd2f089e6b0b..3102e7cf6a48375a2216303b0e1769532ed37270 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -793,14 +793,6 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { s32 *s;
- /* - * Do not patch out the default return thunks if those needed are the - * ones generated by the compiler. - */ - if (cpu_feature_enabled(X86_FEATURE_RETHUNK) && - (x86_return_thunk == __x86_return_thunk)) - return; - for (s = start; s < end; s++) { void *dest = NULL, *addr = (void *)s + *s; struct insn insn;
commit a75bf27fe41abe658c53276a0c486c4bf9adecfc upstream.
RETs in the lower half of a cacheline may be affected by the ITS bug, specifically when the RSB underflows. Use an ITS-safe return thunk for such RETs; a sketch of the address check follows the list below.

RETs that are not patched:

- RET in a retpoline sequence does not need to be patched, because the
  sequence itself fills an RSB entry before the RET.
- RETs in the .init section are not reachable after init.
- RETs that are explicitly marked safe with ANNOTATE_UNRET_SAFE.
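Here is the promised sketch of the address check, mirroring cpu_wants_rethunk_at() from the hunk below with the cpufeature and thunk-pointer tests reduced to plain booleans; the example addresses are arbitrary.

/*
 * Sketch of cpu_wants_rethunk_at(): when rethunk is enabled and the active
 * return thunk is the ITS one, only RET sites in the lower half of a
 * cacheline (bit 5 of the address clear) are rewritten.
 */
#include <stdbool.h>
#include <stdio.h>

static bool wants_rethunk_at(unsigned long addr, bool rethunk_enabled,
                             bool using_its_return_thunk)
{
    if (!rethunk_enabled)
        return false;
    if (!using_its_return_thunk)
        return true;            /* non-ITS thunks patch every RET site */

    return !(addr & 0x20);      /* ITS: only lower-half RETs need the thunk */
}

int main(void)
{
    printf("lower-half RET: %d\n", wants_rethunk_at(0xffff0004UL, true, true));
    printf("upper-half RET: %d\n", wants_rethunk_at(0xffff0024UL, true, true));
    return 0;
}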
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/alternative.h   | 14 ++++++++++++++
 arch/x86/include/asm/nospec-branch.h |  6 ++++++
 arch/x86/kernel/alternative.c        | 17 ++++++++++++++++-
 arch/x86/kernel/ftrace.c             |  2 +-
 arch/x86/kernel/static_call.c        |  2 +-
 arch/x86/kernel/vmlinux.lds.S        |  2 ++
 arch/x86/lib/retpoline.S             | 13 ++++++++++++-
 arch/x86/net/bpf_jit_comp.c          |  2 +-
 8 files changed, 53 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 0e777b27972be34ceecafb5437302e649d36aaee..d7f33c1e052b7724db10fe23c1dfc63ab09dd2c9 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,6 +80,20 @@ extern void apply_returns(s32 *start, s32 *end);
struct module;
+#ifdef CONFIG_RETHUNK +extern bool cpu_wants_rethunk(void); +extern bool cpu_wants_rethunk_at(void *addr); +#else +static __always_inline bool cpu_wants_rethunk(void) +{ + return false; +} +static __always_inline bool cpu_wants_rethunk_at(void *addr) +{ + return false; +} +#endif + #ifdef CONFIG_SMP extern void alternatives_smp_module_add(struct module *mod, char *name, void *locks, void *locks_end, diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h index 4cc0ee529325e62d5b30f171933b47f31ff14a92..7ccaefaa16a683110e58c58c7a22926e2191965f 100644 --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -226,6 +226,12 @@ extern void __x86_return_thunk(void); static inline void __x86_return_thunk(void) {} #endif
+#ifdef CONFIG_MITIGATION_ITS +extern void its_return_thunk(void); +#else +static inline void its_return_thunk(void) {} +#endif + extern void retbleed_return_thunk(void); extern void srso_return_thunk(void); extern void srso_alias_return_thunk(void); diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 3102e7cf6a48375a2216303b0e1769532ed37270..ae4a6bc25b29c19842dd281d041fb23cb5ba5a75 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -760,6 +760,21 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
#ifdef CONFIG_RETHUNK
+bool cpu_wants_rethunk(void) +{ + return cpu_feature_enabled(X86_FEATURE_RETHUNK); +} + +bool cpu_wants_rethunk_at(void *addr) +{ + if (!cpu_feature_enabled(X86_FEATURE_RETHUNK)) + return false; + if (x86_return_thunk != its_return_thunk) + return true; + + return !((unsigned long)addr & 0x20); +} + /* * Rewrite the compiler generated return thunk tail-calls. * @@ -776,7 +791,7 @@ static int patch_return(void *addr, struct insn *insn, u8 *bytes) int i = 0;
/* Patch the custom return thunks... */ - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { + if (cpu_wants_rethunk_at(addr)) { i = JMP32_INSN_SIZE; __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); } else { diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index 46447877b59419984581a294b637fb99c8c781c4..fba03ad16cceb1e17a64559989660327ce09a866 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -367,7 +367,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) goto fail;
ip = trampoline + size; - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + if (cpu_wants_rethunk_at(ip)) __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE); else memcpy(ip, retq, sizeof(retq)); diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c index 4544f124bbd4d6b1bc7ada65e0122b891640fa9d..42564d29eb1bacdd1e11ccca1c0cb14b8c4e10b2 100644 --- a/arch/x86/kernel/static_call.c +++ b/arch/x86/kernel/static_call.c @@ -41,7 +41,7 @@ static void __ref __static_call_transform(void *insn, enum insn_type type, break;
case RET: - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else code = &retinsn; diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 6cee70927281f30871cb5a9680d11c5bf44b4156..1f77896515c5230f738b9637ea5c5920bd58a623 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -542,6 +542,8 @@ INIT_PER_CPU(irq_stack_backing_store); . = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_thunk_rax not in second half of cacheline"); . = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart"); . = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array"); + +. = ASSERT(its_return_thunk & 0x20, "its_return_thunk not in second half of cacheline"); #endif
#endif /* CONFIG_X86_32 */ diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S index 9bde4e9d8c2a36bfa2bc804efefefe10ffd5eae7..01fcf0cd679bd65f509c2ae8e334572c42f7e77b 100644 --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -281,7 +281,18 @@ SYM_CODE_START(__x86_indirect_its_thunk_array) .align 64, 0xcc SYM_CODE_END(__x86_indirect_its_thunk_array)
-#endif +.align 64, 0xcc +.skip 32, 0xcc +SYM_CODE_START(its_return_thunk) + UNWIND_HINT_FUNC + ANNOTATE_NOENDBR + ANNOTATE_UNRET_SAFE + ret + int3 +SYM_CODE_END(its_return_thunk) +EXPORT_SYMBOL(its_return_thunk) + +#endif /* CONFIG_MITIGATION_ITS */
SYM_CODE_START(__x86_return_thunk) UNWIND_HINT_FUNC diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 6225e8a8349f76bb8d62bb5613823bc083ffaf4b..c322702126407e4102cbd0a874264dd69b17b9fd 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -408,7 +408,7 @@ static void emit_return(u8 **pprog, u8 *ip) u8 *prog = *pprog; int cnt = 0;
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { + if (cpu_wants_rethunk()) { emit_jump(&prog, x86_return_thunk, ip); } else { EMIT1(0xC3); /* ret */
The below error was reported in a 32-bit kernel build:

static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at'
make[1]: [Makefile:1234: vmlinux] Error

This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION, which is only enabled in 64-bit mode.

Define the empty function for CONFIG_STACK_VALIDATION=n; the rethunk mitigation is anyway not supported without it.
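Below is a minimal standalone illustration of the fix pattern, assuming nothing beyond standard C: provide an inline stub when either Kconfig symbol is missing so that callers always link. The symbol names mirror the hunk below and are plain #defines here, only for the example.

/*
 * Illustration: guard the extern declaration with both Kconfig symbols and
 * fall back to an inline stub otherwise, so every caller links.
 */
#include <stdbool.h>
#include <stdio.h>

#define CONFIG_RETHUNK 1
/* #define CONFIG_STACK_VALIDATION 1 */ /* e.g. not set on 32-bit builds */

#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION)
extern bool cpu_wants_rethunk_at(void *addr);
#else
static inline bool cpu_wants_rethunk_at(void *addr) { (void)addr; return false; }
#endif

int main(void)
{
    /* With STACK_VALIDATION unset, the stub resolves and returns false. */
    printf("%d\n", cpu_wants_rethunk_at((void *)0x1000));
    return 0;
}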
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 5d19a0574b75 ("x86/its: Add support for ITS-safe return thunk")
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us...
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/include/asm/alternative.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index d7f33c1e052b7724db10fe23c1dfc63ab09dd2c9..4b9a2842f90e972a5569a363479c5885bd919287 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,7 +80,7 @@ extern void apply_returns(s32 *start, s32 *end);
struct module;
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr); #else
commit f4818881c47fd91fcb6d62373c57c7844e3de1c0 upstream.
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with eIBRS. It affects prediction of indirect branches and RETs in the lower half of a cacheline. Due to ITS, such branches may get wrongly predicted to the target of a (direct or indirect) branch that is located in the upper half of the cacheline.

Scope of impact
===============

Guest/host isolation
--------------------
When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to branches in the guest.

Intra-mode
----------
cBPF or other native gadgets can be used for intra-mode training and disclosure using ITS.

User/kernel isolation
---------------------
When eIBRS is enabled, user/kernel isolation is not impacted.

Indirect Branch Prediction Barrier (IBPB)
-----------------------------------------
After an IBPB, indirect branches may be predicted with targets corresponding to direct branches which were executed prior to the IBPB. This is mitigated by a microcode update.
Add the cmdline parameter indirect_target_selection=off|on|force to control the mitigation, which relocates the affected branches to an ITS-safe thunk, i.e. one located in the upper half of a cacheline. Also add sysfs reporting.

When the retpoline mitigation is deployed, ITS safe thunks are not needed, because the retpoline sequence is already ITS-safe. Similarly, when the call depth tracking (CDT) mitigation is deployed (retbleed=stuff), the ITS-safe return thunk is not used, as CDT prevents RSB underflow.

To not overcomplicate things, the ITS mitigation is not supported with the spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy the lfence;jmp mitigation on ITS-affected parts anyway.
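To summarize the selection order in code, here is a hedged, standalone sketch that mirrors its_select_mitigation() from the hunk below. Plain booleans stand in for the cpufeature, spectre_v2 and Kconfig checks, and the function and enum names here exist only for the example.

/*
 * Sketch of the selection order: mitigation is off when the CPU is not
 * affected, the cmdline disables it, spectre_v2 is off, or the lfence;jmp
 * mitigation is in use. When retpoline is already enabled, only the return
 * thunk is switched to the ITS-safe one; otherwise indirect branches are
 * relocated to the aligned thunks as well.
 */
#include <stdbool.h>
#include <stdio.h>

enum its_mitigation { ITS_OFF, ITS_ALIGNED_THUNKS };

static enum its_mitigation its_select(bool bug_its, bool cmd_on,
                                      bool spectre_v2_on, bool lfence_mitigation,
                                      bool retpoline_on, bool *patch_indirect)
{
    *patch_indirect = false;

    if (!bug_its || !cmd_on)
        return ITS_OFF;
    if (!spectre_v2_on || lfence_mitigation)
        return ITS_OFF;     /* depends on spectre_v2; lfence;jmp unsupported */

    /* Retpoline already covers indirect branches; otherwise thunk them too. */
    *patch_indirect = !retpoline_on;
    return ITS_ALIGNED_THUNKS;
}

int main(void)
{
    bool patch_indirect;
    enum its_mitigation m;

    m = its_select(true, true, true, false, true, &patch_indirect);
    printf("retpoline host: mitigation=%d patch_indirect=%d\n",
           m, patch_indirect);
    return 0;
}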
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |   1 +
 Documentation/admin-guide/kernel-parameters.txt    |  13 +++
 arch/x86/kernel/cpu/bugs.c                         | 128 ++++++++++++++++++++-
 drivers/base/cpu.c                                 |   8 ++
 include/linux/cpu.h                                |   2 +
 5 files changed, 149 insertions(+), 3 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 2a273bfebed057feff4c776b8c2d1c14d84feefb..194be3f4f72fbf3a5d22c7f4b68f946002f68681 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -502,6 +502,7 @@ Description: information about CPUs heterogeneity.
What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/gather_data_sampling + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection /sys/devices/system/cpu/vulnerabilities/itlb_multihit /sys/devices/system/cpu/vulnerabilities/l1tf /sys/devices/system/cpu/vulnerabilities/mds diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 12af5b0ecc8e3d7deaf8d80bc94ef462731800f4..544ab81880f09357b6e1d9b255b08b9dc70e31d2 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1851,6 +1851,18 @@ different crypto accelerators. This option can be used to achieve best performance for particular HW.
+ indirect_target_selection= [X86,Intel] Mitigation control for Indirect + Target Selection(ITS) bug in Intel CPUs. Updated + microcode is also required for a fix in IBPB. + + on: Enable mitigation (default). + off: Disable mitigation. + force: Force the ITS bug and deploy default + mitigation. + + For details see: + Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + init= [KNL] Format: <full_path> Run specified binary instead of /sbin/init as init @@ -2938,6 +2950,7 @@ improves system performance, but it may also expose users to several CPU vulnerabilities. Equivalent to: gather_data_sampling=off [X86] + indirect_target_selection=off [X86] kpti=0 [ARM64] kvm.nx_huge_pages=off [X86] l1tf=off [X86] diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 9b3611e4cb80c2f8dd4f510226781c4b8fdbf541..5d555db06ff51f0a27b9a9fd40d6ba11077c2409 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -47,6 +47,7 @@ static void __init mmio_select_mitigation(void); static void __init srbds_select_mitigation(void); static void __init gds_select_mitigation(void); static void __init srso_select_mitigation(void); +static void __init its_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */ u64 x86_spec_ctrl_base; @@ -63,6 +64,14 @@ static DEFINE_MUTEX(spec_ctrl_mutex);
void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
+static void __init set_return_thunk(void *thunk) +{ + if (x86_return_thunk != __x86_return_thunk) + pr_warn("x86/bugs: return thunk changed\n"); + + x86_return_thunk = thunk; +} + /* Update SPEC_CTRL MSR and its cached copy unconditionally */ static void update_spec_ctrl(u64 val) { @@ -161,6 +170,7 @@ void __init cpu_select_mitigations(void) */ srso_select_mitigation(); gds_select_mitigation(); + its_select_mitigation(); }
/* @@ -1050,7 +1060,7 @@ static void __init retbleed_select_mitigation(void) setup_force_cpu_cap(X86_FEATURE_UNRET);
if (IS_ENABLED(CONFIG_RETHUNK)) - x86_return_thunk = retbleed_return_thunk; + set_return_thunk(retbleed_return_thunk);
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) @@ -1111,6 +1121,105 @@ static void __init retbleed_select_mitigation(void) pr_info("%s\n", retbleed_strings[retbleed_mitigation]); }
+#undef pr_fmt +#define pr_fmt(fmt) "ITS: " fmt + +enum its_mitigation_cmd { + ITS_CMD_OFF, + ITS_CMD_ON, +}; + +enum its_mitigation { + ITS_MITIGATION_OFF, + ITS_MITIGATION_ALIGNED_THUNKS, +}; + +static const char * const its_strings[] = { + [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", +}; + +static enum its_mitigation its_mitigation __ro_after_init = ITS_MITIGATION_ALIGNED_THUNKS; + +static enum its_mitigation_cmd its_cmd __ro_after_init = + IS_ENABLED(CONFIG_MITIGATION_ITS) ? ITS_CMD_ON : ITS_CMD_OFF; + +static int __init its_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!IS_ENABLED(CONFIG_MITIGATION_ITS)) { + pr_err("Mitigation disabled at compile time, ignoring option (%s)", str); + return 0; + } + + if (!strcmp(str, "off")) { + its_cmd = ITS_CMD_OFF; + } else if (!strcmp(str, "on")) { + its_cmd = ITS_CMD_ON; + } else if (!strcmp(str, "force")) { + its_cmd = ITS_CMD_ON; + setup_force_cpu_bug(X86_BUG_ITS); + } else { + pr_err("Ignoring unknown indirect_target_selection option (%s).", str); + } + + return 0; +} +early_param("indirect_target_selection", its_parse_cmdline); + +static void __init its_select_mitigation(void) +{ + enum its_mitigation_cmd cmd = its_cmd; + + if (!boot_cpu_has_bug(X86_BUG_ITS) || cpu_mitigations_off()) { + its_mitigation = ITS_MITIGATION_OFF; + return; + } + + /* Exit early to avoid irrelevant warnings */ + if (cmd == ITS_CMD_OFF) { + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (spectre_v2_enabled == SPECTRE_V2_NONE) { + pr_err("WARNING: Spectre-v2 mitigation is off, disabling ITS\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (!IS_ENABLED(CONFIG_RETPOLINE) || !IS_ENABLED(CONFIG_RETHUNK)) { + pr_err("WARNING: ITS mitigation depends on retpoline and rethunk support\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (IS_ENABLED(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)) { + pr_err("WARNING: ITS mitigation is not compatible with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) { + pr_err("WARNING: ITS mitigation is not compatible with lfence mitigation\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + + switch (cmd) { + case ITS_CMD_OFF: + its_mitigation = ITS_MITIGATION_OFF; + break; + case ITS_CMD_ON: + its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; + if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) + setup_force_cpu_cap(X86_FEATURE_INDIRECT_THUNK_ITS); + setup_force_cpu_cap(X86_FEATURE_RETHUNK); + set_return_thunk(its_return_thunk); + break; + } +out: + pr_info("%s\n", its_strings[its_mitigation]); +} + #undef pr_fmt #define pr_fmt(fmt) "Spectre V2 : " fmt
@@ -2457,10 +2566,10 @@ static void __init srso_select_mitigation(void)
if (boot_cpu_data.x86 == 0x19) { setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS); - x86_return_thunk = srso_alias_return_thunk; + set_return_thunk(srso_alias_return_thunk); } else { setup_force_cpu_cap(X86_FEATURE_SRSO); - x86_return_thunk = srso_return_thunk; + set_return_thunk(srso_return_thunk); } srso_mitigation = SRSO_MITIGATION_SAFE_RET; } else { @@ -2640,6 +2749,11 @@ static ssize_t rfds_show_state(char *buf) return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); }
+static ssize_t its_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", its_strings[its_mitigation]); +} + static char *stibp_state(void) { if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && @@ -2804,6 +2918,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr case X86_BUG_RFDS: return rfds_show_state(buf);
+ case X86_BUG_ITS: + return its_show_state(buf); + default: break; } @@ -2883,4 +3000,9 @@ ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attrib { return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); } + +ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_ITS); +} #endif diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index e3aed8333f0976d4f5b0aee568e0a00437403d3d..0b238634d4624e8c5bf9f5ca63129d5768486c50 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -597,6 +597,12 @@ ssize_t __weak cpu_show_reg_file_data_sampling(struct device *dev, return sysfs_emit(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_indirect_target_selection(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); @@ -611,6 +617,7 @@ static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL); static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL); static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); +static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -627,6 +634,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_gather_data_sampling.attr, &dev_attr_spec_rstack_overflow.attr, &dev_attr_reg_file_data_sampling.attr, + &dev_attr_indirect_target_selection.attr, NULL };
diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 2099226d8623816fe30d6fc02049cca615ece63a..f7ce86fc28830cccecb33e05c0abeab91771532c 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -76,6 +76,8 @@ extern ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *buf); extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_indirect_target_selection(struct device *dev, + struct device_attribute *attr, char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
commit 2665281a07e19550944e8354a2024635a7b2714a upstream.
Ice Lake generation CPUs are not affected by the guest/host isolation part of ITS. Users who are only concerned about KVM guests can now choose the new cmdline option "vmexit", which does not deploy the ITS mitigation when the CPU is not affected by guest/host isolation. This saves the performance overhead of the ITS mitigation on Ice Lake generation CPUs.
When the "vmexit" option is selected and the CPU is affected by the ITS guest/host isolation, the default ITS mitigation is deployed.
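Put differently, the "vmexit" selection reduces to the following (an illustrative standalone sketch, not the kernel code; the helper name is made up and the enum mirrors the one extended in the diff below):

#include <stdbool.h>

enum its_mitigation {
	ITS_MITIGATION_OFF,
	ITS_MITIGATION_VMEXIT_ONLY,
	ITS_MITIGATION_ALIGNED_THUNKS,
};

/* its_vmexit_choice() is a made-up helper: "vmexit" only skips the
 * aligned-thunks mitigation when guest/host isolation is not affected. */
static enum its_mitigation its_vmexit_choice(bool its_native_only)
{
	if (its_native_only)
		return ITS_MITIGATION_VMEXIT_ONLY;
	return ITS_MITIGATION_ALIGNED_THUNKS;	/* fall through to the default */
}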
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/bugs.c | 11 +++++++++++ arch/x86/kernel/cpu/common.c | 19 ++++++++++++------- 4 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 544ab81880f09357b6e1d9b255b08b9dc70e31d2..564089fd0af0cb392400bf0a58643993cf4b667c 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1859,6 +1859,8 @@ off: Disable mitigation. force: Force the ITS bug and deploy default mitigation. + vmexit: Only deploy mitigation if CPU is affected by + guest/host isolation part of ITS.
For details see: Documentation/admin-guide/hw-vuln/indirect-target-selection.rst diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index a9ccd2ac2d7ae1dd5b9da90fe7d7bd31abb3df96..e2dc271e6f39583325a5bd8098c59339cc52e9c8 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -460,4 +460,5 @@ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */ #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ #define X86_BUG_ITS X86_BUG(1*32 + 5) /* CPU is affected by Indirect Target Selection */ +#define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 6) /* CPU is affected by ITS, VMX is not affected */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 5d555db06ff51f0a27b9a9fd40d6ba11077c2409..282995c465042947c91db71d7e684edd9dd0d5e9 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1127,15 +1127,18 @@ static void __init retbleed_select_mitigation(void) enum its_mitigation_cmd { ITS_CMD_OFF, ITS_CMD_ON, + ITS_CMD_VMEXIT, };
enum its_mitigation { ITS_MITIGATION_OFF, + ITS_MITIGATION_VMEXIT_ONLY, ITS_MITIGATION_ALIGNED_THUNKS, };
static const char * const its_strings[] = { [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_VMEXIT_ONLY] = "Mitigation: Vulnerable, KVM: Not affected", [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", };
@@ -1161,6 +1164,8 @@ static int __init its_parse_cmdline(char *str) } else if (!strcmp(str, "force")) { its_cmd = ITS_CMD_ON; setup_force_cpu_bug(X86_BUG_ITS); + } else if (!strcmp(str, "vmexit")) { + its_cmd = ITS_CMD_VMEXIT; } else { pr_err("Ignoring unknown indirect_target_selection option (%s).", str); } @@ -1208,6 +1213,12 @@ static void __init its_select_mitigation(void) case ITS_CMD_OFF: its_mitigation = ITS_MITIGATION_OFF; break; + case ITS_CMD_VMEXIT: + if (boot_cpu_has_bug(X86_BUG_ITS_NATIVE_ONLY)) { + its_mitigation = ITS_MITIGATION_VMEXIT_ONLY; + goto out; + } + fallthrough; case ITS_CMD_ON: its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a5f4f8e637713569f2671dbddaf8203415560256..ca412c2882aa67731dab48ce97f55c142134ad01 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1136,6 +1136,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { #define RFDS BIT(7) /* CPU is affected by Indirect Target Selection */ #define ITS BIT(8) +/* CPU is affected by Indirect Target Selection, but guest-host isolation is not affected */ +#define ITS_NATIVE_ONLY BIT(9)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1156,16 +1158,16 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xc), MMIO | RETBLEED | GDS | SRBDS), VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS), - VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED | ITS), VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), @@ -1369,8 +1371,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
- if (vulnerable_to_its(ia32_cap)) + if (vulnerable_to_its(ia32_cap)) { setup_force_cpu_bug(X86_BUG_ITS); + if (cpu_matches(cpu_vuln_blacklist, ITS_NATIVE_ONLY)) + setup_force_cpu_bug(X86_BUG_ITS_NATIVE_ONLY); + }
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
From: Thomas Gleixner tglx@linutronix.de
commit 4c4eb3ecc91f4fee6d6bf7cfbc1e21f2e38d19ff upstream.
Instead of resetting permissions all over the place when freeing module memory, tell the vmalloc code to do so. This avoids the exercise for the next upcoming user.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220915111143.406703869@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/ftrace.c | 2 -- arch/x86/kernel/kprobes/core.c | 1 - arch/x86/kernel/module.c | 8 ++++---- 3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index fba03ad16cceb1e17a64559989660327ce09a866..bb02d5d474c28370c9e420caeb9209d2569514b0 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -422,8 +422,6 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) /* ALLOC_TRAMP flags lets us know we created it */ ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
- set_vm_flush_reset_perms(trampoline); - if (likely(system_state != SYSTEM_BOOTING)) set_memory_ro((unsigned long)trampoline, npages); set_memory_x((unsigned long)trampoline, npages); diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index d481825f12bfee53177399ad5ad6aaf7a7fa9c6b..5254317125d89514539137f18e1f3ebeedea45dd 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -403,7 +403,6 @@ void *alloc_insn_page(void) if (!page) return NULL;
- set_vm_flush_reset_perms(page); /* * First make the page read-only, and only then make it executable to * prevent it from being W+X in between. diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index 455e195847f9e9502b4cc9a3d470f60be3ae70d7..e28c701e8f8b0f09324dc7af7f27e66e2c5482e0 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -73,10 +73,10 @@ void *module_alloc(unsigned long size) return NULL;
p = __vmalloc_node_range(size, MODULE_ALIGN, - MODULES_VADDR + get_module_load_offset(), - MODULES_END, GFP_KERNEL, - PAGE_KERNEL, 0, NUMA_NO_NODE, - __builtin_return_address(0)); + MODULES_VADDR + get_module_load_offset(), + MODULES_END, GFP_KERNEL, PAGE_KERNEL, + VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, + __builtin_return_address(0)); if (p && (kasan_module_alloc(p, size) < 0)) { vfree(p); return NULL;
From: Peter Zijlstra peterz@infradead.org
commit 872df34d7c51a79523820ea6a14860398c639b87 upstream.
The ITS mitigation moves unsafe indirect branches to a safe thunk. This could degrade prediction accuracy, as the source address of the indirect branches becomes the same for different execution paths.
To improve the predictions, and hence the performance, assign a separate thunk for each indirect callsite. This is also a defense-in-depth measure to avoid indirect branches aliasing with each other.
As an example, 5000 dynamic thunks would utilize around 16 bits of the address space, thereby gaining entropy. For a BTB that uses 32 bits for indexing, dynamic thunks could provide better prediction accuracy than fixed thunks.
Have the ITS thunks be variable-sized and use EXECMEM_MODULE_TEXT such that they are both more flexible (they may need to be extended later) and live in 2M TLB entries, just like kernel code, avoiding undue TLB pressure.
[ pawan: CONFIG_EXECMEM and CONFIG_EXECMEM_ROX are not supported on backport kernel, made changes to use module_alloc() and set_memory_*() for dynamic thunks. ]
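For clarity, a minimal sketch of the cacheline-offset adjustment done by its_allocate_thunk() in the diff below (standalone version for illustration; the helper name is made up):

/*
 * its_adjust_offset() is a made-up name mirroring the logic in
 * its_allocate_thunk(): if the last byte of a 'size'-byte thunk placed at
 * 'offset' would land in the lower half of a 64-byte cacheline, advance
 * the offset so the thunk starts at byte 32 (the upper half) of a line.
 */
static unsigned int its_adjust_offset(unsigned int offset, unsigned int size)
{
	if ((offset + size - 1) % 64 < 32)
		offset = ((offset - 1) | 0x3F) + 33;
	return offset;
}

For example, a 4-byte thunk requested at offset 62 gets moved to offset 96, i.e. byte 32 of the following cacheline, so it can never straddle into a lower half.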
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/alternative.h | 10 +++ arch/x86/kernel/alternative.c | 133 ++++++++++++++++++++++++++++++++++++- arch/x86/kernel/module.c | 6 ++ include/linux/module.h | 5 ++ 4 files changed, 151 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 4b9a2842f90e972a5569a363479c5885bd919287..4ab021fd3887211723e0bf0f68683e713f260f07 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,6 +80,16 @@ extern void apply_returns(s32 *start, s32 *end);
struct module;
+#ifdef CONFIG_MITIGATION_ITS +extern void its_init_mod(struct module *mod); +extern void its_fini_mod(struct module *mod); +extern void its_free_mod(struct module *mod); +#else /* CONFIG_MITIGATION_ITS */ +static inline void its_init_mod(struct module *mod) { } +static inline void its_fini_mod(struct module *mod) { } +static inline void its_free_mod(struct module *mod) { } +#endif + #if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr); diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index ae4a6bc25b29c19842dd281d041fb23cb5ba5a75..33d17fe841828738ee248e3c2dbd2693ffd33672 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -18,6 +18,7 @@ #include <linux/mmu_context.h> #include <linux/bsearch.h> #include <linux/sync_core.h> +#include <linux/moduleloader.h> #include <asm/text-patching.h> #include <asm/alternative.h> #include <asm/sections.h> @@ -29,6 +30,7 @@ #include <asm/io.h> #include <asm/fixmap.h> #include <asm/asm-prototypes.h> +#include <asm/set_memory.h>
int __read_mostly alternatives_patched;
@@ -552,6 +554,127 @@ static int emit_indirect(int op, int reg, u8 *bytes)
#ifdef CONFIG_MITIGATION_ITS
+static struct module *its_mod; +static void *its_page; +static unsigned int its_offset; + +/* Initialize a thunk with the "jmp *reg; int3" instructions. */ +static void *its_init_thunk(void *thunk, int reg) +{ + u8 *bytes = thunk; + int i = 0; + + if (reg >= 8) { + bytes[i++] = 0x41; /* REX.B prefix */ + reg -= 8; + } + bytes[i++] = 0xff; + bytes[i++] = 0xe0 + reg; /* jmp *reg */ + bytes[i++] = 0xcc; + + return thunk; +} + +void its_init_mod(struct module *mod) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + mutex_lock(&text_mutex); + its_mod = mod; + its_page = NULL; +} + +void its_fini_mod(struct module *mod) +{ + int i; + + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + WARN_ON_ONCE(its_mod != mod); + + its_mod = NULL; + its_page = NULL; + mutex_unlock(&text_mutex); + + for (i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + set_memory_ro((unsigned long)page, 1); + set_memory_x((unsigned long)page, 1); + } +} + +void its_free_mod(struct module *mod) +{ + int i; + + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + for (i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + module_memfree(page); + } + kfree(mod->its_page_array); +} + +static void *its_alloc(void) +{ + void *page = module_alloc(PAGE_SIZE); + + if (!page) + return NULL; + + if (its_mod) { + void *tmp = krealloc(its_mod->its_page_array, + (its_mod->its_num_pages+1) * sizeof(void *), + GFP_KERNEL); + if (!tmp) { + module_memfree(page); + return NULL; + } + + its_mod->its_page_array = tmp; + its_mod->its_page_array[its_mod->its_num_pages++] = page; + } + + return page; +} + +static void *its_allocate_thunk(int reg) +{ + int size = 3 + (reg / 8); + void *thunk; + + if (!its_page || (its_offset + size - 1) >= PAGE_SIZE) { + its_page = its_alloc(); + if (!its_page) { + pr_err("ITS page allocation failed\n"); + return NULL; + } + memset(its_page, INT3_INSN_OPCODE, PAGE_SIZE); + its_offset = 32; + } + + /* + * If the indirect branch instruction will be in the lower half + * of a cacheline, then update the offset to reach the upper half. + */ + if ((its_offset + size - 1) % 64 < 32) + its_offset = ((its_offset - 1) | 0x3F) + 33; + + thunk = its_page + its_offset; + its_offset += size; + + set_memory_rw((unsigned long)its_page, 1); + thunk = its_init_thunk(thunk, reg); + set_memory_ro((unsigned long)its_page, 1); + set_memory_x((unsigned long)its_page, 1); + + return thunk; +} + static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, void *call_dest, void *jmp_dest) { @@ -599,9 +722,13 @@ static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes,
static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) { - return __emit_trampoline(addr, insn, bytes, - __x86_indirect_its_thunk_array[reg], - __x86_indirect_its_thunk_array[reg]); + u8 *thunk = __x86_indirect_its_thunk_array[reg]; + u8 *tmp = its_allocate_thunk(reg); + + if (tmp) + thunk = tmp; + + return __emit_trampoline(addr, insn, bytes, thunk, thunk); }
/* Check if an indirect branch is at ITS-unsafe address */ diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index e28c701e8f8b0f09324dc7af7f27e66e2c5482e0..1ab8e583c795d16d00535d4e5f56e494ea8e32a1 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -274,10 +274,15 @@ int module_finalize(const Elf_Ehdr *hdr, returns = s; }
+ its_init_mod(me); + if (retpolines) { void *rseg = (void *)retpolines->sh_addr; apply_retpolines(rseg, rseg + retpolines->sh_size); } + + its_fini_mod(me); + if (returns) { void *rseg = (void *)returns->sh_addr; apply_returns(rseg, rseg + returns->sh_size); @@ -313,4 +318,5 @@ int module_finalize(const Elf_Ehdr *hdr, void module_arch_cleanup(struct module *mod) { alternatives_smp_module_del(mod); + its_free_mod(mod); } diff --git a/include/linux/module.h b/include/linux/module.h index 63fe94e6ae6f19a0d340aefdb31e3791c9df325c..f5a150c42918c63ff563038b521c9506d8b415a2 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -524,6 +524,11 @@ struct module { atomic_t refcnt; #endif
+#ifdef CONFIG_MITIGATION_ITS + int its_num_pages; + void **its_page_array; +#endif + #ifdef CONFIG_CONSTRUCTORS /* Constructor functions. */ ctor_fn_t *ctors;
From: Eric Biggers ebiggers@google.com
commit 9f35e33144ae5377d6a8de86dd3bd4d995c6ac65 upstream.
Fix several build errors when CONFIG_MODULES=n, including the following:
../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module'
  195 |         for (int i = 0; i < mod->its_num_pages; i++) {
[ pawan: backport: Bring ITS dynamic thunk code under CONFIG_MODULES ]
Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Acked-by: Dave Hansen dave.hansen@intel.com Tested-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/alternative.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 33d17fe841828738ee248e3c2dbd2693ffd33672..923b4d1e98dbc378270e254960fcb7ce1bc3e665 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -554,6 +554,7 @@ static int emit_indirect(int op, int reg, u8 *bytes)
#ifdef CONFIG_MITIGATION_ITS
+#ifdef CONFIG_MODULES static struct module *its_mod; static void *its_page; static unsigned int its_offset; @@ -674,6 +675,14 @@ static void *its_allocate_thunk(int reg)
return thunk; } +#else /* CONFIG_MODULES */ + +static void *its_allocate_thunk(int reg) +{ + return NULL; +} + +#endif /* CONFIG_MODULES */
static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, void *call_dest, void *jmp_dest)
From: Peter Zijlstra peterz@infradead.org
commit e52c1dc7455d32c8a55f9949d300e5e87d011fa6 upstream.
FineIBT-paranoid was using the retpoline bytes for the paranoid check, disabling retpolines, because all parts that have IBT also have eIBRS and thus don't need no stinking retpolines.
Except... ITS needs the retpolines, because indirect calls must not be in the first half of a cacheline :-/
So what was the paranoid call sequence:
<fineibt_paranoid_start>:
   0:   41 ba 78 56 34 12       mov    $0x12345678, %r10d
   6:   45 3b 53 f7             cmp    -0x9(%r11), %r10d
   a:   4d 8d 5b <f0>           lea    -0x10(%r11), %r11
   e:   75 fd                   jne    d <fineibt_paranoid_start+0xd>
  10:   41 ff d3                call   *%r11
  13:   90                      nop
Now becomes:
<fineibt_paranoid_start>:
   0:   41 ba 78 56 34 12       mov    $0x12345678, %r10d
   6:   45 3b 53 f7             cmp    -0x9(%r11), %r10d
   a:   4d 8d 5b f0             lea    -0x10(%r11), %r11
   e:   2e e8 XX XX XX XX       cs call __x86_indirect_paranoid_thunk_r11
Where the paranoid_thunk looks like:
  1d:   <ea>                    (bad)
__x86_indirect_paranoid_thunk_r11:
  1e:   75 fd                   jne    1d
__x86_indirect_its_thunk_r11:
  20:   41 ff eb                jmp    *%r11
  23:   cc                      int3
[ dhansen: remove initialization to false ]
[ pawan: move the its_static_thunk() definition to alternative.c. This is done to avoid a build failure due to a circular dependency between kernel.h (asm-generic/bug.h) and asm/alternative.h, which is needed for WARN_ONCE(). ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com [ Just a portion of the original commit, in order to fix a build issue in stable kernels due to backports ] Tested-by: Holger Hoffstätte holger@applied-asynchrony.com Link: https://lore.kernel.org/r/20250514113952.GB16434@noisy.programming.kicks-ass... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com --- arch/x86/include/asm/alternative.h | 2 ++ arch/x86/kernel/alternative.c | 19 ++++++++++++++++++- arch/x86/net/bpf_jit_comp.c | 2 +- 3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 4ab021fd3887211723e0bf0f68683e713f260f07..f2cce0cd87a65c4a9ad2d934cc8c8bb651971576 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,6 +80,8 @@ extern void apply_returns(s32 *start, s32 *end);
struct module;
+extern u8 *its_static_thunk(int reg); + #ifdef CONFIG_MITIGATION_ITS extern void its_init_mod(struct module *mod); extern void its_fini_mod(struct module *mod); diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 923b4d1e98dbc378270e254960fcb7ce1bc3e665..30dc73210c2ea4c14de38810f47db1d5381ce378 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -752,7 +752,24 @@ static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg) /* Lower-half of the cacheline? */ return !(addr & 0x20); } -#endif + +u8 *its_static_thunk(int reg) +{ + u8 *thunk = __x86_indirect_its_thunk_array[reg]; + + return thunk; +} + +#else /* CONFIG_MITIGATION_ITS */ + +u8 *its_static_thunk(int reg) +{ + WARN_ONCE(1, "ITS not compiled in"); + + return NULL; +} + +#endif /* CONFIG_MITIGATION_ITS */
/* * Rewrite the compiler generated retpoline thunk calls. diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index c322702126407e4102cbd0a874264dd69b17b9fd..d3450088569ff3364f456dfa5f2e06ca90b80fe7 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -390,7 +390,7 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip) if (IS_ENABLED(CONFIG_MITIGATION_ITS) && cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) { OPTIMIZER_HIDE_VAR(reg); - emit_jump(&prog, &__x86_indirect_its_thunk_array[reg], ip); + emit_jump(&prog, its_static_thunk(reg), ip); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { EMIT_LFENCE(); EMIT2(0xFF, 0xE0 + reg);