From: Sean Christopherson <seanjc(a)google.com>
When loading guest XSAVE state via KVM_SET_XSAVE, and when updating XFD in
response to a guest WRMSR, clear XFD-disabled features in the saved (or to
be restored) XSTATE_BV to ensure KVM doesn't attempt to load state for
features that are disabled via the guest's XFD. Because the kernel
executes XRSTOR with the guest's XFD, saving XSTATE_BV[i]=1 with XFD[i]=1
will cause XRSTOR to #NM and panic the kernel.
E.g. if fpu_update_guest_xfd() sets XFD without clearing XSTATE_BV:
------------[ cut here ]------------
WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#29: amx_test/848
Modules linked in: kvm_intel kvm irqbypass
CPU: 29 UID: 1000 PID: 848 Comm: amx_test Not tainted 6.19.0-rc2-ffa07f7fd437-x86_amx_nm_xfd_non_init-vm #171 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:exc_device_not_available+0x101/0x110
Call Trace:
<TASK>
asm_exc_device_not_available+0x1a/0x20
RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
switch_fpu_return+0x4a/0xb0
kvm_arch_vcpu_ioctl_run+0x1245/0x1e40 [kvm]
kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x62/0x940
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
This can happen if the guest executes WRMSR(MSR_IA32_XFD) to set XFD[18] = 1,
and a host IRQ triggers kernel_fpu_begin() prior to the vmexit handler's
call to fpu_update_guest_xfd().
and if userspace stuffs XSTATE_BV[i]=1 via KVM_SET_XSAVE:
------------[ cut here ]------------
WARNING: arch/x86/kernel/traps.c:1524 at exc_device_not_available+0x101/0x110, CPU#14: amx_test/867
Modules linked in: kvm_intel kvm irqbypass
CPU: 14 UID: 1000 PID: 867 Comm: amx_test Not tainted 6.19.0-rc2-2dace9faccd6-x86_amx_nm_xfd_non_init-vm #168 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:exc_device_not_available+0x101/0x110
Call Trace:
<TASK>
asm_exc_device_not_available+0x1a/0x20
RIP: 0010:restore_fpregs_from_fpstate+0x36/0x90
fpu_swap_kvm_fpstate+0x6b/0x120
kvm_load_guest_fpu+0x30/0x80 [kvm]
kvm_arch_vcpu_ioctl_run+0x85/0x1e40 [kvm]
kvm_vcpu_ioctl+0x2c3/0x8f0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x62/0x940
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
---[ end trace 0000000000000000 ]---
The new behavior is consistent with the AMX architecture. Per Intel's SDM,
XSAVE saves XSTATE_BV as '0' for components that are disabled via XFD
(and non-compacted XSAVE saves the initial configuration of the state
component):
If XSAVE, XSAVEC, XSAVEOPT, or XSAVES is saving the state component i,
the instruction does not generate #NM when XCR0[i] = IA32_XFD[i] = 1;
instead, it operates as if XINUSE[i] = 0 (and the state component was
in its initial state): it saves bit i of XSTATE_BV field of the XSAVE
header as 0; in addition, XSAVE saves the initial configuration of the
state component (the other instructions do not save state component i).
Alternatively, KVM could always do XRSTOR with XFD=0, e.g. by using
a constant XFD based on the set of enabled features when XSAVEing for
a struct fpu_guest. However, having XSTATE_BV[i]=1 for XFD-disabled
features can only happen in the above interrupt case, or in similar
scenarios involving preemption on preemptible kernels, because
fpu_swap_kvm_fpstate()'s call to save_fpregs_to_fpstate() saves the
outgoing FPU state with the current XFD; and that is (on all but the
first WRMSR to XFD) the guest XFD.
Therefore, XFD can only go out of sync with XSTATE_BV in the above
interrupt case, or in similar scenarios involving preemption on
preemptible kernels, and it we can consider it (de facto) part of KVM
ABI that KVM_GET_XSAVE returns XSTATE_BV[i]=0 for XFD-disabled features.
Reported-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: 820a6ee944e7 ("kvm: x86: Add emulation for IA32_XFD", 2022-01-14)
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
[Move clearing of XSTATE_BV from fpu_copy_uabi_to_guest_fpstate
to kvm_vcpu_ioctl_x86_set_xsave. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
arch/x86/kernel/fpu/core.c | 32 +++++++++++++++++++++++++++++---
arch/x86/kvm/x86.c | 9 +++++++++
2 files changed, 38 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index da233f20ae6f..166c380b0161 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -319,10 +319,29 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
#ifdef CONFIG_X86_64
void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
{
+ struct fpstate *fpstate = guest_fpu->fpstate;
+
fpregs_lock();
- guest_fpu->fpstate->xfd = xfd;
- if (guest_fpu->fpstate->in_use)
- xfd_update_state(guest_fpu->fpstate);
+
+ /*
+ * KVM's guest ABI is that setting XFD[i]=1 *can* immediately revert
+ * the save state to initialized. Likewise, KVM_GET_XSAVE does the
+ * same as XSAVE and returns XSTATE_BV[i]=0 whenever XFD[i]=1.
+ *
+ * If the guest's FPU state is in hardware, just update XFD: the XSAVE
+ * in fpu_swap_kvm_fpstate will clear XSTATE_BV[i] whenever XFD[i]=1.
+ *
+ * If however the guest's FPU state is NOT resident in hardware, clear
+ * disabled components in XSTATE_BV now, or a subsequent XRSTOR will
+ * attempt to load disabled components and generate #NM _in the host_.
+ */
+ if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
+ fpstate->regs.xsave.header.xfeatures &= ~xfd;
+
+ fpstate->xfd = xfd;
+ if (fpstate->in_use)
+ xfd_update_state(fpstate);
+
fpregs_unlock();
}
EXPORT_SYMBOL_FOR_KVM(fpu_update_guest_xfd);
@@ -430,6 +449,13 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
if (ustate->xsave.header.xfeatures & ~xcr0)
return -EINVAL;
+ /*
+ * Disabled features must be in their initial state, otherwise XRSTOR
+ * causes an exception.
+ */
+ if (WARN_ON_ONCE(ustate->xsave.header.xfeatures & kstate->xfd))
+ return -EINVAL;
+
/*
* Nullify @vpkru to preserve its current value if PKRU's bit isn't set
* in the header. KVM's odd ABI is to leave PKRU untouched in this
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff8812f3a129..c0416f53b5f5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5807,9 +5807,18 @@ static int kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
struct kvm_xsave *guest_xsave)
{
+ union fpregs_state *xstate = (union fpregs_state *)guest_xsave->region;
+
if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
return vcpu->kvm->arch.has_protected_state ? -EINVAL : 0;
+ /*
+ * Do not reject non-initialized disabled features for backwards
+ * compatibility, but clear XSTATE_BV[i] whenever XFD[i]=1.
+ * Otherwise, XRSTOR would cause a #NM.
+ */
+ xstate->xsave.header.xfeatures &= ~vcpu->arch.guest_fpu.fpstate->xfd;
+
return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
guest_xsave->region,
kvm_caps.supported_xcr0,
--
2.52.0
The patch below does not apply to the 6.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.18.y
git checkout FETCH_HEAD
git cherry-pick -x 8a0e4bdddd1c998b894d879a1d22f1e745606215
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025122925-victory-numeral-2346@gregkh' --subject-prefix 'PATCH 6.18.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8a0e4bdddd1c998b894d879a1d22f1e745606215 Mon Sep 17 00:00:00 2001
From: Wei Yang <richard.weiyang(a)gmail.com>
Date: Thu, 6 Nov 2025 03:41:55 +0000
Subject: [PATCH] mm/huge_memory: merge uniform_split_supported() and
non_uniform_split_supported()
uniform_split_supported() and non_uniform_split_supported() share
significantly similar logic.
The only functional difference is that uniform_split_supported() includes
an additional check on the requested @new_order.
The reason for this check comes from the following two aspects:
* some file system or swap cache just supports order-0 folio
* the behavioral difference between uniform/non-uniform split
The behavioral difference between uniform split and non-uniform:
* uniform split splits folio directly to @new_order
* non-uniform split creates after-split folios with orders from
folio_order(folio) - 1 to new_order.
This means for non-uniform split or !new_order split we should check the
file system and swap cache respectively.
This commit unifies the logic and merge the two functions into a single
combined helper, removing redundant code and simplifying the split
support checking mechanism.
Link: https://lkml.kernel.org/r/20251106034155.21398-3-richard.weiyang@gmail.com
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: "David Hildenbrand (Red Hat)" <david(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Lance Yang <lance.yang(a)linux.dev>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Nico Pache <npache(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b74708dc5b5f..19d4a5f52ca2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -374,10 +374,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
unsigned int new_order, bool unmapped);
int min_order_for_split(struct folio *folio);
int split_folio_to_list(struct folio *folio, struct list_head *list);
-bool uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns);
-bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns);
+bool folio_split_supported(struct folio *folio, unsigned int new_order,
+ enum split_type split_type, bool warns);
int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
struct list_head *list);
@@ -408,7 +406,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
static inline int try_folio_split_to_order(struct folio *folio,
struct page *page, unsigned int new_order)
{
- if (!non_uniform_split_supported(folio, new_order, /* warns= */ false))
+ if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
return split_huge_page_to_order(&folio->page, new_order);
return folio_split(folio, new_order, page, NULL);
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4118f330c55e..d79a4bb363de 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3593,8 +3593,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
return 0;
}
-bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns)
+bool folio_split_supported(struct folio *folio, unsigned int new_order,
+ enum split_type split_type, bool warns)
{
if (folio_test_anon(folio)) {
/* order-1 is not supported for anonymous THP. */
@@ -3602,48 +3602,41 @@ bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
"Cannot split to order-1 folio");
if (new_order == 1)
return false;
- } else if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
- !mapping_large_folio_support(folio->mapping)) {
- /*
- * No split if the file system does not support large folio.
- * Note that we might still have THPs in such mappings due to
- * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
- * does not actually support large folios properly.
- */
- VM_WARN_ONCE(warns,
- "Cannot split file folio to non-0 order");
- return false;
- }
-
- /* Only swapping a whole PMD-mapped folio is supported */
- if (folio_test_swapcache(folio)) {
- VM_WARN_ONCE(warns,
- "Cannot split swapcache folio to non-0 order");
- return false;
- }
-
- return true;
-}
-
-/* See comments in non_uniform_split_supported() */
-bool uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns)
-{
- if (folio_test_anon(folio)) {
- VM_WARN_ONCE(warns && new_order == 1,
- "Cannot split to order-1 folio");
- if (new_order == 1)
- return false;
- } else if (new_order) {
+ } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
!mapping_large_folio_support(folio->mapping)) {
+ /*
+ * We can always split a folio down to a single page
+ * (new_order == 0) uniformly.
+ *
+ * For any other scenario
+ * a) uniform split targeting a large folio
+ * (new_order > 0)
+ * b) any non-uniform split
+ * we must confirm that the file system supports large
+ * folios.
+ *
+ * Note that we might still have THPs in such
+ * mappings, which is created from khugepaged when
+ * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
+ * case, the mapping does not actually support large
+ * folios properly.
+ */
VM_WARN_ONCE(warns,
"Cannot split file folio to non-0 order");
return false;
}
}
- if (new_order && folio_test_swapcache(folio)) {
+ /*
+ * swapcache folio could only be split to order 0
+ *
+ * non-uniform split creates after-split folios with orders from
+ * folio_order(folio) - 1 to new_order, making it not suitable for any
+ * swapcache folio split. Only uniform split to order-0 can be used
+ * here.
+ */
+ if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
VM_WARN_ONCE(warns,
"Cannot split swapcache folio to non-0 order");
return false;
@@ -3711,11 +3704,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
if (new_order >= old_order)
return -EINVAL;
- if (split_type == SPLIT_TYPE_UNIFORM && !uniform_split_supported(folio, new_order, true))
- return -EINVAL;
-
- if (split_type == SPLIT_TYPE_NON_UNIFORM &&
- !non_uniform_split_supported(folio, new_order, true))
+ if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
return -EINVAL;
is_hzp = is_huge_zero_folio(folio);
kstack_offset was previously maintained per-cpu, but this caused a
couple of issues. So let's instead make it per-task.
Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
expected and required to be called with interrupts and preemption
disabled so that it could manipulate per-cpu state. But arm64, loongarch
and risc-v are calling them with interrupts and preemption enabled. I
don't _think_ this causes any functional issues, but it's certainly
unexpected and could lead to manipulating the wrong cpu's state, which
could cause a minor performance degradation due to bouncing the cache
lines. By maintaining the state per-task those functions can safely be
called in preemptible context.
Issue 2: add_random_kstack_offset() is called before executing the
syscall and expands the stack using a previously chosen rnadom offset.
choose_random_kstack_offset() is called after executing the syscall and
chooses and stores a new random offset for the next syscall. With
per-cpu storage for this offset, an attacker could force cpu migration
during the execution of the syscall and prevent the offset from being
updated for the original cpu such that it is predictable for the next
syscall on that cpu. By maintaining the state per-task, this problem
goes away because the per-task random offset is updated after the
syscall regardless of which cpu it is executing on.
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Closes: https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/
Cc: stable(a)vger.kernel.org
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
include/linux/randomize_kstack.h | 26 +++++++++++++++-----------
include/linux/sched.h | 4 ++++
init/main.c | 1 -
kernel/fork.c | 2 ++
4 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
index 1d982dbdd0d0..5d3916ca747c 100644
--- a/include/linux/randomize_kstack.h
+++ b/include/linux/randomize_kstack.h
@@ -9,7 +9,6 @@
DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
-DECLARE_PER_CPU(u32, kstack_offset);
/*
* Do not use this anywhere else in the kernel. This is used here because
@@ -50,15 +49,14 @@ DECLARE_PER_CPU(u32, kstack_offset);
* add_random_kstack_offset - Increase stack utilization by previously
* chosen random offset
*
- * This should be used in the syscall entry path when interrupts and
- * preempt are disabled, and after user registers have been stored to
- * the stack. For testing the resulting entropy, please see:
- * tools/testing/selftests/lkdtm/stack-entropy.sh
+ * This should be used in the syscall entry path after user registers have been
+ * stored to the stack. Preemption may be enabled. For testing the resulting
+ * entropy, please see: tools/testing/selftests/lkdtm/stack-entropy.sh
*/
#define add_random_kstack_offset() do { \
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
- u32 offset = raw_cpu_read(kstack_offset); \
+ u32 offset = current->kstack_offset; \
u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset)); \
/* Keep allocation even after "ptr" loses scope. */ \
asm volatile("" :: "r"(ptr) : "memory"); \
@@ -69,9 +67,9 @@ DECLARE_PER_CPU(u32, kstack_offset);
* choose_random_kstack_offset - Choose the random offset for the next
* add_random_kstack_offset()
*
- * This should only be used during syscall exit when interrupts and
- * preempt are disabled. This position in the syscall flow is done to
- * frustrate attacks from userspace attempting to learn the next offset:
+ * This should only be used during syscall exit. Preemption may be enabled. This
+ * position in the syscall flow is done to frustrate attacks from userspace
+ * attempting to learn the next offset:
* - Maximize the timing uncertainty visible from userspace: if the
* offset is chosen at syscall entry, userspace has much more control
* over the timing between choosing offsets. "How long will we be in
@@ -85,14 +83,20 @@ DECLARE_PER_CPU(u32, kstack_offset);
#define choose_random_kstack_offset(rand) do { \
if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
&randomize_kstack_offset)) { \
- u32 offset = raw_cpu_read(kstack_offset); \
+ u32 offset = current->kstack_offset; \
offset = ror32(offset, 5) ^ (rand); \
- raw_cpu_write(kstack_offset, offset); \
+ current->kstack_offset = offset; \
} \
} while (0)
+
+static inline void random_kstack_task_init(struct task_struct *tsk)
+{
+ tsk->kstack_offset = 0;
+}
#else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#define add_random_kstack_offset() do { } while (0)
#define choose_random_kstack_offset(rand) do { } while (0)
+#define random_kstack_task_init(tsk) do { } while (0)
#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d395f2810fac..9e0080ed1484 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1591,6 +1591,10 @@ struct task_struct {
unsigned long prev_lowest_stack;
#endif
+#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
+ u32 kstack_offset;
+#endif
+
#ifdef CONFIG_X86_MCE
void __user *mce_vaddr;
__u64 mce_kflags;
diff --git a/init/main.c b/init/main.c
index b84818ad9685..27fcbbde933e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -830,7 +830,6 @@ static inline void initcall_debug_enable(void)
#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
randomize_kstack_offset);
-DEFINE_PER_CPU(u32, kstack_offset);
static int __init early_randomize_kstack_offset(char *buf)
{
diff --git a/kernel/fork.c b/kernel/fork.c
index b1f3915d5f8e..b061e1edbc43 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -95,6 +95,7 @@
#include <linux/thread_info.h>
#include <linux/kstack_erase.h>
#include <linux/kasan.h>
+#include <linux/randomize_kstack.h>
#include <linux/scs.h>
#include <linux/io_uring.h>
#include <linux/bpf.h>
@@ -2231,6 +2232,7 @@ __latent_entropy struct task_struct *copy_process(
if (retval)
goto bad_fork_cleanup_io;
+ random_kstack_task_init(p);
stackleak_task_init(p);
if (pid != &init_struct_pid) {
--
2.43.0
Synopsys renamed DWC_usb32 IP to DWC_usb4 as of IP version 1.30. No
functional change except checking for the IP_NAME here. The driver will
treat the new IP_NAME as if it's DWC_usb32. Additional features for USB4
will be introduced and checked separately.
Cc: stable(a)vger.kernel.org
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
---
drivers/usb/dwc3/core.c | 2 ++
drivers/usb/dwc3/core.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 96f85eada047..f71b75465a60 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -993,6 +993,8 @@ static bool dwc3_core_is_valid(struct dwc3 *dwc)
reg = dwc3_readl(dwc->regs, DWC3_GSNPSID);
dwc->ip = DWC3_GSNPS_ID(reg);
+ if (dwc->ip == DWC4_IP)
+ dwc->ip = DWC32_IP;
/* This should read as U3 followed by revision number */
if (DWC3_IP_IS(DWC3)) {
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index a5fc92c4ffa3..45757169b672 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1265,6 +1265,7 @@ struct dwc3 {
#define DWC3_IP 0x5533
#define DWC31_IP 0x3331
#define DWC32_IP 0x3332
+#define DWC4_IP 0x3430
u32 revision;
base-commit: 18514fd70ea4ca9de137bb3bceeac1bac4bcad75
--
2.28.0
Hi,
syzbot reported a circular locking dependency in the NET/ROM routing
code involving nr_neigh_list_lock, nr_node_list_lock and
nr_node->node_lock when nr_rt_device_down() interacts with the
ioctl path. This series fixes that deadlock and also addresses a
long-standing reference count leak found while auditing the same
code.
Patch 1/2 refactors nr_rt_device_down() to avoid nested locking
between nr_neigh_list_lock and nr_node_list_lock by doing two
separate passes over nodes and neighbours, and adjusts nr_rt_free()
to follow the same lock ordering.
Patch 2/2 fixes a per-route reference count leak by dropping
nr_neigh->count and calling nr_neigh_put() when removing routes
from nr_rt_device_down(), mirroring the behaviour of
nr_dec_obs()/nr_del_node().
[1] https://syzkaller.appspot.com/bug?extid=14afda08dc3484d5db82
Thanks,
Junjie
Hi stable maintainers,
I have tried backporting some fixes to stable kernel 6.12.y which also
have CVE numbers and are fixing commits in 6.12.y.
I am not a subsystem expert and have only done overall testing that we
do for stable release candidate testing and not any patch specific testing.
Note: All these patches are present backports from upstream.
PATCH 1: The broken commit is in 6.12.y, and the fix is a clean
cherry-pick and addresses CVE-2025-68206
PATCH 2: The broken commit is present in 6.12.y and the fix is a clean
cherry-pick and addresses CVE-2025-40325.
PATCH 3: The broken commit is present in 6.12.y and backport needed a
minor conflict resolution due to missing commit fe69a3918084
("drm/panthor: Fix UAF in panthor_gem_create_with_handle() debugfs
in 6.12.y
PATCH 4,5,6: Patch 4 and 5 are pulled in as prerequisites for PATCH 6 which
is a fix for CVE-2025-40170 and needed a minor conflict resolution due to missing
commit: 22d6c9eebf2e ("net: Unexport shared functions for DCCP.") in 6.12.y
PATCH 7: The broken commit in present in 6.12.y and the backport of the
fix needed a minor conflict resolution due to missing commit in 6.12.y.
This is fix for CVE-2025-40164.
Please let me know if there are any comments.
Regards,
Harshit
Andrii Melnychenko (1):
netfilter: nft_ct: add seqadj extension for natted connections
Boris Brezillon (1):
drm/panthor: Flush shmem writes before mapping buffers CPU-uncached
Eric Dumazet (2):
ipv6: adopt dst_dev() helper
net: use dst_dev_rcu() in sk_setup_caps()
Justin Iurman (1):
net: ipv6: ioam6: use consistent dst names
Xiao Ni (1):
md/raid10: wait barrier before returning discard request with
REQ_NOWAIT
Zqiang (1):
usbnet: Fix using smp_processor_id() in preemptible code warnings
drivers/gpu/drm/panthor/panthor_gem.c | 18 +++++++++++++
drivers/md/raid10.c | 3 +--
drivers/net/usb/usbnet.c | 2 ++
include/net/ip.h | 6 +++--
include/net/ip6_route.h | 4 +--
include/net/route.h | 2 +-
net/core/sock.c | 16 +++++++-----
net/ipv6/exthdrs.c | 2 +-
net/ipv6/icmp.c | 4 ++-
net/ipv6/ila/ila_lwt.c | 2 +-
net/ipv6/ioam6_iptunnel.c | 37 ++++++++++++++-------------
net/ipv6/ip6_gre.c | 8 +++---
net/ipv6/ip6_output.c | 19 +++++++-------
net/ipv6/ip6_tunnel.c | 4 +--
net/ipv6/ip6_udp_tunnel.c | 2 +-
net/ipv6/ip6_vti.c | 2 +-
net/ipv6/ndisc.c | 6 +++--
net/ipv6/netfilter/nf_dup_ipv6.c | 2 +-
net/ipv6/output_core.c | 2 +-
net/ipv6/route.c | 20 +++++++++------
net/ipv6/rpl_iptunnel.c | 4 +--
net/ipv6/seg6_iptunnel.c | 20 ++++++++-------
net/ipv6/seg6_local.c | 2 +-
net/netfilter/nft_ct.c | 5 ++++
24 files changed, 118 insertions(+), 74 deletions(-)
--
2.50.1
On x86_64:
When the second-stage kernel is booted via kexec with a limiting command
line such as "mem=<size>" we observe a pafe fault that happens.
BUG: unable to handle page fault for address: ffff97793ff47000
RIP: ima_restore_measurement_list+0xdc/0x45a
#PF: error_code(0x0000) – not-present page
This happens on x86_64 only, as this is already fixed in aarch64 in
commit: cbf9c4b9617b ("of: check previous kernel's ima-kexec-buffer
against memory bounds")
V1: https://lore.kernel.org/all/20251112193005.3772542-1-harshit.m.mogalapalli@…
V1 attempted to do a similar sanity check in x86_64. Borislav suggested
to add a generic helper ima_validate_range() which could then be used
for both OF based and x86_64.
Testing information:
--------------------
On x86_64: With latest 6.19-rc2 based, we could reproduce the issue, and
patched kernel works fine. (with mem=8G on a 16G memory machine)
Thanks to Yifei for finding enabling IMA_KEXEC is the cause.
Thanks for the reviews on V1.
V1 -> V2:
- Patch 1: Add a generic helper "ima_validate_range()"
- Patch 2: Use this new helper in drivers/of/kexec.c -> No functional
change.
- Patch 3: Fix the page fault by doing sanity check with
"ima_validate_range()"
V2: https://lore.kernel.org/all/20251229081523.622515-1-harshit.m.mogalapalli@o…
V2 -> V3:
Update subject of Patch 1 to more appropriate one (Suggested by Mimi
Zohar)
Thanks,
Harshit
Harshit Mogalapalli (3):
ima: verify the previous kernel's IMA buffer lies in addressable RAM
of/kexec: refactor ima_get_kexec_buffer() to use ima_validate_range()
x86/kexec: Add a sanity check on previous kernel's ima kexec buffer
arch/x86/kernel/setup.c | 6 +++++
drivers/of/kexec.c | 15 +++----------
include/linux/ima.h | 1 +
security/integrity/ima/ima_kexec.c | 35 ++++++++++++++++++++++++++++++
4 files changed, 45 insertions(+), 12 deletions(-)
--
2.50.1