v2:
- This includes the backport of the recently upstreamed mitigation for
the CPU vulnerability Register File Data Sampling (RFDS) (CVE-2023-28746).
RFDS depends on the "Delay VERW" series, so it is convenient to merge
them together.
- rebased to v5.15.151
v1: https://lore.kernel.org/r/20240304-delay-verw-backport-5-15-y-v1-0-fd02afc0…
This is the backport of the recently upstreamed series that moves VERW
execution to a later point in the exit-to-user path. This is needed
because in some cases data accessed after VERW executes may end up in
MDS-affected CPU buffers. Moving VERW closer to the ring transition
reduces the attack surface.
- The series includes a dependency commit f87bc8dc7a7c ("x86/asm: Add
_ASM_RIP() macro for x86-64 (%rip) suffix").
- Patch 2 includes a change that adds runtime patching for jmp (instead
of verw in original series) due to lack of rip-relative relocation
support in kernels <v6.5.
- Fixed warning:
arch/x86/entry/entry.o: warning: objtool: mds_verw_sel+0x0: unreachable instruction.
- Resolved merge conflicts in:
swapgs_restore_regs_and_return_to_usermode in entry_64.S.
__vmx_vcpu_run in vmenter.S.
vmx_update_fb_clear_dis in vmx.c.
- Boot tested with KASLR and KPTI enabled.
- Verified VERW being executed with mitigation ON, and not being
executed with mitigation turned OFF.
To: stable(a)vger.kernel.org
Signed-off-by: Pawan Gupta <pawan.kumar.gupta(a)linux.intel.com>
---
H. Peter Anvin (Intel) (1):
x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix
Pawan Gupta (9):
x86/bugs: Add asm helpers for executing VERW
x86/entry_64: Add VERW just before userspace transition
x86/entry_32: Add VERW just before userspace transition
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set
Documentation/hw-vuln: Add documentation for RFDS
x86/rfds: Mitigate Register File Data Sampling (RFDS)
KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests
Sean Christopherson (1):
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Documentation/ABI/testing/sysfs-devices-system-cpu | 1 +
Documentation/admin-guide/hw-vuln/index.rst | 1 +
.../admin-guide/hw-vuln/reg-file-data-sampling.rst | 104 ++++++++++++++++++++
Documentation/admin-guide/kernel-parameters.txt | 21 ++++
Documentation/x86/mds.rst | 38 +++++---
arch/x86/Kconfig | 11 +++
arch/x86/entry/entry.S | 23 +++++
arch/x86/entry/entry_32.S | 3 +
arch/x86/entry/entry_64.S | 11 +++
arch/x86/entry/entry_64_compat.S | 1 +
arch/x86/include/asm/asm.h | 5 +
arch/x86/include/asm/cpufeatures.h | 3 +-
arch/x86/include/asm/entry-common.h | 1 -
arch/x86/include/asm/msr-index.h | 8 ++
arch/x86/include/asm/nospec-branch.h | 27 +++---
arch/x86/kernel/cpu/bugs.c | 107 ++++++++++++++++++---
arch/x86/kernel/cpu/common.c | 38 +++++++-
arch/x86/kernel/nmi.c | 3 -
arch/x86/kvm/vmx/run_flags.h | 7 +-
arch/x86/kvm/vmx/vmenter.S | 9 +-
arch/x86/kvm/vmx/vmx.c | 12 ++-
arch/x86/kvm/x86.c | 5 +-
drivers/base/cpu.c | 8 ++
include/linux/cpu.h | 2 +
24 files changed, 394 insertions(+), 55 deletions(-)
---
base-commit: 57436264850706f50887bbb2148ee2cc797c9485
change-id: 20240304-delay-verw-backport-5-15-y-e16f07fbb71e
Best regards,
--
Thanks,
Pawan
From: Josh Poimboeuf <jpoimboe(a)kernel.org>
[ Upstream commit b388e57d4628eb22782bdad4cd5b83ca87a1b7c9 ]
For CONFIG_RETHUNK kernels, objtool annotates all the function return
sites so they can be patched during boot. By design, after
apply_returns() is called, all tail-calls to the compiler-generated
default return thunk (__x86_return_thunk) should be patched out and
replaced with whatever's needed for any mitigations (or lack thereof).
The commit
4461438a8405 ("x86/retpoline: Ensure default return thunk isn't used at runtime")
adds a runtime check and a WARN_ONCE() if the default return thunk ever
gets executed after alternatives have been applied. This warning is
a sanity check to make sure objtool and apply_returns() are doing their
job.
As Nathan reported, that check found something:
Unpatched return thunk in use. This should not happen!
WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/bugs.c:2856 __warn_thunk+0x27/0x40
RIP: 0010:__warn_thunk+0x27/0x40
Call Trace:
<TASK>
? show_regs
? __warn
? __warn_thunk
? report_bug
? console_unlock
? handle_bug
? exc_invalid_op
? asm_exc_invalid_op
? ia32_binfmt_init
? __warn_thunk
warn_thunk_thunk
do_one_initcall
kernel_init_freeable
? __pfx_kernel_init
kernel_init
ret_from_fork
? __pfx_kernel_init
ret_from_fork_asm
</TASK>
Boris debugged to find that the unpatched return site was in
init_vdso_image_64(), and its translation unit wasn't being analyzed by
objtool, so it never got annotated. So it got ignored by
apply_returns().
This is only a minor issue, as this function is only called during boot.
Still, objtool needs full visibility to the kernel. Fix it by enabling
objtool on vdso-image-{32,64}.o.
Note this problem can only be seen with !CONFIG_X86_KERNEL_IBT, as that
requires objtool to run individually on all translation units rather
than on vmlinux.o.
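To illustrate the failure mode described above, here is a minimal user-space model, not kernel code. The names model_apply_returns(), model_execute_return(), struct ret_site and the function name "some_annotated_func" are all hypothetical and exist only for this sketch; the real apply_returns() rewrites instruction bytes at annotated addresses. The model only shows why a return site that objtool never annotated stays on the default thunk and trips a warn-once check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a return site currently jumps (toy model, hypothetical names). */
enum ret_target { DEFAULT_THUNK, PATCHED_RETURN };

struct ret_site {
    const char *func;       /* function containing the return site */
    bool annotated;         /* did the build-time tool record this site? */
    enum ret_target target; /* current jump target */
};

/*
 * Patch every annotated site. Unannotated sites are invisible to the
 * patcher, so they keep pointing at the default thunk.
 * Returns the number of sites patched.
 */
static size_t model_apply_returns(struct ret_site *sites, size_t n)
{
    size_t patched = 0;

    for (size_t i = 0; i < n; i++) {
        if (sites[i].annotated) {
            sites[i].target = PATCHED_RETURN;
            patched++;
        }
    }
    return patched;
}

/*
 * "Execute" a return site. Returns true only the first time an
 * unpatched site is hit, mimicking a warn-once sanity check.
 */
static bool model_execute_return(const struct ret_site *site, bool *warned)
{
    assert(site != NULL && warned != NULL);

    if (site->target == DEFAULT_THUNK && !*warned) {
        *warned = true;
        return true;
    }
    return false;
}
```

In this model, marking the init_vdso_image_64 entry as annotated before calling model_apply_returns() is the analogue of enabling objtool on vdso-image-{32,64}.o: the site gets patched and the check stays silent.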
[ bp: Massage commit message. ]
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Link: https://lore.kernel.org/r/20240215032049.GA3944823@dev-arch.thelio-3990X
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/x86/entry/vdso/Makefile | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index b1b8dd1608f7e..4ee59121b9053 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -34,8 +34,12 @@ obj-y += vma.o extable.o
KASAN_SANITIZE_vma.o := y
UBSAN_SANITIZE_vma.o := y
KCSAN_SANITIZE_vma.o := y
-OBJECT_FILES_NON_STANDARD_vma.o := n
-OBJECT_FILES_NON_STANDARD_extable.o := n
+
+OBJECT_FILES_NON_STANDARD_extable.o := n
+OBJECT_FILES_NON_STANDARD_vdso-image-32.o := n
+OBJECT_FILES_NON_STANDARD_vdso-image-64.o := n
+OBJECT_FILES_NON_STANDARD_vdso32-setup.o := n
+OBJECT_FILES_NON_STANDARD_vma.o := n
# vDSO images to build
vdso_img-$(VDSO64-y) += 64
@@ -43,7 +47,6 @@ vdso_img-$(VDSOX32-y) += x32
vdso_img-$(VDSO32-y) += 32
obj-$(VDSO32-y) += vdso32-setup.o
-OBJECT_FILES_NON_STANDARD_vdso32-setup.o := n
vobjs := $(foreach F,$(vobjs-y),$(obj)/$F)
vobjs32 := $(foreach F,$(vobjs32-y),$(obj)/$F)
--
2.43.0
Hi Stable Team,
In 5.15, unmapping large KVM VMs on arm64 can generate softlockups. My team has
been hitting this when tearing down VMs larger than 100 GB.
Oliver fixed this with the attached patches. They've been in mainline since
6.1.
I tested on 5.15.150 with these patches applied. When they're present,
both the dirty_log_perf_test detailed in the second patch, and
kvm_page_table_test no longer generate softlockups when unmapping VMs
with large memory configurations.
Would you please consider these patches for inclusion in an upcoming 5.15
release?
Thanks,
-K
Oliver Upton (2):
KVM: arm64: Work out supported block level at compile time
KVM: arm64: Limit stage2_apply_range() batch size to largest block
arch/arm64/include/asm/kvm_pgtable.h | 18 +++++++++++++-----
arch/arm64/include/asm/stage2_pgtable.h | 20 --------------------
arch/arm64/kvm/mmu.c | 9 ++++++++-
3 files changed, 21 insertions(+), 26 deletions(-)
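For readers unfamiliar with the approach named in the second patch title, the idea of limiting each batch to one block can be sketched in user-space C as follows. This is only an illustration under stated assumptions: apply_range_batched(), batch_end(), the callback types and struct batch_stats are hypothetical names, the yield callback stands in for whatever rescheduling the kernel does between batches, and the real stage2_apply_range() operates on KVM page tables under a lock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback types; the kernel's real signatures differ. */
typedef int (*range_fn)(uint64_t addr, uint64_t size, void *ctx);
typedef void (*yield_fn)(void *ctx);

/* End of the current batch: next block-aligned boundary, capped at end. */
static uint64_t batch_end(uint64_t addr, uint64_t end, uint64_t block_size)
{
    uint64_t boundary = (addr + block_size) & ~(block_size - 1);

    /* Comparing with "- 1" stays correct if boundary wraps to 0. */
    return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * Apply fn over [addr, end) in batches of at most one block, giving the
 * caller a chance to yield between batches so one huge range cannot
 * monopolize the CPU.
 */
static int apply_range_batched(uint64_t addr, uint64_t end,
                               uint64_t block_size,
                               range_fn fn, yield_fn yield, void *ctx)
{
    /* block_size must be a non-zero power of two. */
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    while (addr < end) {
        uint64_t next = batch_end(addr, end, block_size);
        int ret = fn(addr, next - addr, ctx);

        if (ret)
            return ret;
        if (yield && next != end)
            yield(ctx);
        addr = next;
    }
    return 0;
}

/* Demo bookkeeping so the batching behaviour can be observed. */
struct batch_stats {
    unsigned int calls;
    unsigned int yields;
    uint64_t bytes;
    uint64_t max_batch;
};

static int count_batch(uint64_t addr, uint64_t size, void *ctx)
{
    struct batch_stats *s = ctx;

    (void)addr;
    s->calls++;
    s->bytes += size;
    if (size > s->max_batch)
        s->max_batch = size;
    return 0;
}

static void count_yield(void *ctx)
{
    struct batch_stats *s = ctx;

    s->yields++;
}
```

With a 1 GiB block size, an unaligned range spanning three blocks' worth of bytes is split at each block boundary, so no single callback invocation covers more than one block and there is a yield point between consecutive batches.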
--
2.25.1