The patch titled
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
Date: Fri, 2 Jan 2026 20:55:20 +0000
Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.
The key piece of logic introduced was the ability to merge a faulted VMA
immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to
correctly handle anon_vma state.
In the case of the merge of an existing VMA (that is, changing the
properties of a VMA and then merging it if those properties are shared by
adjacent VMAs), dup_anon_vma() is invoked correctly.
However in the case of the merge of a new VMA, a corner case peculiar to
mremap() was missed.
The issue is that vma_expand() only performs dup_anon_vma() if the target
(the VMA that will ultimately become the merged VMA) is not the next VMA,
i.e. the one that appears after the range in which the new VMA is to be
established.
A key insight here is that, in all cases other than mremap(), a new VMA
merge either expands an existing VMA (meaning the target VMA will be that
VMA) or has a NULL anon_vma.
Specifically:
* __mmap_region() - no anon_vma in place, initial mapping.
* do_brk_flags() - expanding an existing VMA.
* vma_merge_extend() - expanding an existing VMA.
* relocate_vma_down() - no anon_vma in place, initial mapping.
In addition, we are in the unique situation of needing to duplicate
anon_vma state from a VMA that is neither the previous nor the next VMA
being merged with.
To account for this, introduce a new field in struct vma_merge_struct
specifically for the mremap() case, and update vma_expand() to explicitly
check for this case and invoke dup_anon_vma() to ensure anon_vma state is
correctly propagated.
This issue can be observed most directly by invoking mremap() with the
MREMAP_DONTUNMAP flag specified to move a VMA around and cause this kind
of merge.
This results in unlink_anon_vmas() being called after we fail to
duplicate anon_vma state to the target VMA, so the anon_vma itself is
freed while folios still hold dangling pointers to it, i.e. a
use-after-free bug.
This bug was discovered via a syzbot report, which this patch resolves.
The following program reproduces the issue (and is fixed by this patch):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#define RESERVED_PGS (100)
#define VMA_A_PGS (10)
#define VMA_B_PGS (10)
#define NUM_ITERS (1000)
static void trigger_bug(void)
{
unsigned long page_size = sysconf(_SC_PAGE_SIZE);
char *reserved, *ptr_a, *ptr_b;
/*
* The goal here is to achieve:
*
* mremap() with MREMAP_DONTUNMAP such that A and B merge:
*
* |-------------------------|
* | |
* | |-----------| |---------|
* v | unfaulted | | faulted |
* |-----------| |---------|
* B A
*
* Then unmap VMA A to trigger the bug.
*/
/* Reserve a region of memory to operate in. */
reserved = mmap(NULL, RESERVED_PGS * page_size, PROT_NONE,
MAP_PRIVATE | MAP_ANON, -1, 0);
if (reserved == MAP_FAILED) {
perror("mmap reserved");
exit(EXIT_FAILURE);
}
/* Map VMA A into place. */
ptr_a = mmap(&reserved[page_size], VMA_A_PGS * page_size,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
if (ptr_a == MAP_FAILED) {
perror("mmap VMA A");
exit(EXIT_FAILURE);
}
/* Fault it in. */
ptr_a[0] = 'x';
/*
* Now move it out of the way so we can place VMA B in position,
* unfaulted.
*/
ptr_a = mremap(ptr_a, VMA_A_PGS * page_size, VMA_A_PGS * page_size,
MREMAP_FIXED | MREMAP_MAYMOVE, &reserved[50 * page_size]);
if (ptr_a == MAP_FAILED) {
perror("mremap VMA A out of the way");
exit(EXIT_FAILURE);
}
/* Map VMA B into place. */
ptr_b = mmap(&reserved[page_size + VMA_A_PGS * page_size],
VMA_B_PGS * page_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
if (ptr_b == MAP_FAILED) {
perror("mmap VMA B");
exit(EXIT_FAILURE);
}
/* Now move VMA A into position w/MREMAP_DONTUNMAP + free anon_vma. */
ptr_a = mremap(ptr_a, VMA_A_PGS * page_size, VMA_A_PGS * page_size,
MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
&reserved[page_size]);
if (ptr_a == MAP_FAILED) {
perror("mremap VMA A with MREMAP_DONTUNMAP");
exit(EXIT_FAILURE);
}
/* Finally, unmap VMA A which should trigger the bug. */
munmap(ptr_a, VMA_A_PGS * page_size);
/* Cleanup in case bug didn't trigger sufficiently visibly... */
munmap(reserved, RESERVED_PGS * page_size);
}
int main(void)
{
int i;
for (i = 0; i < NUM_ITERS; i++)
trigger_bug();
return EXIT_SUCCESS;
}
Link: https://lkml.kernel.org/r/20260102205520.986725-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Reported-by: syzbot+b165fc2e11771c66d8ba(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jeongjun Park <aha310510(a)gmail.com>
Cc: levi.yun <yeoreum.yun(a)arm.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 58 ++++++++++++++++++++++++++++++++++++++++-------------
mm/vma.h | 3 ++
2 files changed, 47 insertions(+), 14 deletions(-)
--- a/mm/vma.c~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.c
@@ -1130,26 +1130,50 @@ int vma_expand(struct vma_merge_struct *
mmap_assert_write_locked(vmg->mm);
vma_start_write(target);
- if (next && (target != next) && (vmg->end == next->vm_end)) {
+ if (next && vmg->end == next->vm_end) {
+ struct vm_area_struct *copied_from = vmg->copied_from;
int ret;
- sticky_flags |= next->vm_flags & VM_STICKY;
- remove_next = true;
- /* This should already have been checked by this point. */
- VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
- vma_start_write(next);
- /*
- * In this case we don't report OOM, so vmg->give_up_on_mm is
- * safe.
- */
- ret = dup_anon_vma(target, next, &anon_dup);
- if (ret)
- return ret;
+ if (target != next) {
+ sticky_flags |= next->vm_flags & VM_STICKY;
+ remove_next = true;
+ /* This should already have been checked by this point. */
+ VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
+ vma_start_write(next);
+ /*
+ * In this case we don't report OOM, so vmg->give_up_on_mm is
+ * safe.
+ */
+ ret = dup_anon_vma(target, next, &anon_dup);
+ if (ret)
+ return ret;
+ } else if (copied_from) {
+ vma_start_write(next);
+
+ /*
+ * We are copying from a VMA (i.e. mremap()'ing) to
+ * next, and thus must ensure that either anon_vma's are
+ * already compatible (in which case this call is a nop)
+ * or all anon_vma state is propagated to next
+ */
+ ret = dup_anon_vma(next, copied_from, &anon_dup);
+ if (ret)
+ return ret;
+ } else {
+ /* In no other case may the anon_vma differ. */
+ VM_WARN_ON_VMG(target->anon_vma != next->anon_vma, vmg);
+ }
}
/* Not merging but overwriting any part of next is not handled. */
VM_WARN_ON_VMG(next && !remove_next &&
next != target && vmg->end > next->vm_start, vmg);
+ /*
+ * We should only see a copy with next as the target on a new merge
+ * which sets the end to the next of next.
+ */
+ VM_WARN_ON_VMG(target == next && vmg->copied_from &&
+ vmg->end != next->vm_end, vmg);
/* Only handles expanding */
VM_WARN_ON_VMG(target->vm_start < vmg->start ||
target->vm_end > vmg->end, vmg);
@@ -1808,6 +1832,13 @@ struct vm_area_struct *copy_vma(struct v
VMG_VMA_STATE(vmg, &vmi, NULL, vma, addr, addr + len);
/*
+ * VMG_VMA_STATE() installs vma in middle, but this is a new VMA, inform
+ * merging logic correctly.
+ */
+ vmg.copied_from = vma;
+ vmg.middle = NULL;
+
+ /*
* If anonymous vma has not yet been faulted, update new pgoff
* to match new location, to increase its chance of merging.
*/
@@ -1828,7 +1859,6 @@ struct vm_area_struct *copy_vma(struct v
if (new_vma && new_vma->vm_start < addr + len)
return NULL; /* should never get here */
- vmg.middle = NULL; /* New VMA range. */
vmg.pgoff = pgoff;
vmg.next = vma_iter_next_rewind(&vmi, NULL);
new_vma = vma_merge_new_range(&vmg);
--- a/mm/vma.h~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.h
@@ -106,6 +106,9 @@ struct vma_merge_struct {
struct anon_vma_name *anon_name;
enum vma_merge_state state;
+ /* If we are copying a VMA, which VMA are we copying from? */
+ struct vm_area_struct *copied_from;
+
/* Flags which callers can use to modify merge behaviour: */
/*
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
The patch below does not apply to the 6.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.18.y
git checkout FETCH_HEAD
git cherry-pick -x 8a0e4bdddd1c998b894d879a1d22f1e745606215
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025122925-victory-numeral-2346@gregkh' --subject-prefix 'PATCH 6.18.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8a0e4bdddd1c998b894d879a1d22f1e745606215 Mon Sep 17 00:00:00 2001
From: Wei Yang <richard.weiyang(a)gmail.com>
Date: Thu, 6 Nov 2025 03:41:55 +0000
Subject: [PATCH] mm/huge_memory: merge uniform_split_supported() and
non_uniform_split_supported()
uniform_split_supported() and non_uniform_split_supported() share
largely similar logic.
The only functional difference is that uniform_split_supported() includes
an additional check on the requested @new_order.
The reason for this check comes from the following two aspects:
* some file systems and the swap cache only support order-0 folios
* the behavioral difference between uniform and non-uniform split
The behavioral difference between uniform and non-uniform split:
* uniform split splits the folio directly to @new_order
* non-uniform split creates after-split folios with orders from
folio_order(folio) - 1 down to new_order.
This means that for a non-uniform split, or a uniform split to a non-zero
new_order, we must check file system and swap cache support.
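As an illustration of the difference (a worked example, not taken from the
patch): non-uniformly splitting an order-9 folio down to new_order = 3
leaves after-split folios of orders 8, 7, 6, 5, 4, 3 and 3, with the page
of interest ending up in one of the order-3 folios, whereas a uniform
split of the same folio produces 64 order-3 folios directly.  Since a
swapcache folio may only be split uniformly to order 0, both a non-uniform
split and a uniform split to a non-zero order must be rejected for it.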
This commit unifies the logic and merges the two functions into a single
combined helper, removing redundant code and simplifying the split
support checking mechanism.
Link: https://lkml.kernel.org/r/20251106034155.21398-3-richard.weiyang@gmail.com
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: "David Hildenbrand (Red Hat)" <david(a)kernel.org>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Lance Yang <lance.yang(a)linux.dev>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Nico Pache <npache(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b74708dc5b5f..19d4a5f52ca2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -374,10 +374,8 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
unsigned int new_order, bool unmapped);
int min_order_for_split(struct folio *folio);
int split_folio_to_list(struct folio *folio, struct list_head *list);
-bool uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns);
-bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns);
+bool folio_split_supported(struct folio *folio, unsigned int new_order,
+ enum split_type split_type, bool warns);
int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
struct list_head *list);
@@ -408,7 +406,7 @@ static inline int split_huge_page_to_order(struct page *page, unsigned int new_o
static inline int try_folio_split_to_order(struct folio *folio,
struct page *page, unsigned int new_order)
{
- if (!non_uniform_split_supported(folio, new_order, /* warns= */ false))
+ if (!folio_split_supported(folio, new_order, SPLIT_TYPE_NON_UNIFORM, /* warns= */ false))
return split_huge_page_to_order(&folio->page, new_order);
return folio_split(folio, new_order, page, NULL);
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4118f330c55e..d79a4bb363de 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3593,8 +3593,8 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
return 0;
}
-bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns)
+bool folio_split_supported(struct folio *folio, unsigned int new_order,
+ enum split_type split_type, bool warns)
{
if (folio_test_anon(folio)) {
/* order-1 is not supported for anonymous THP. */
@@ -3602,48 +3602,41 @@ bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
"Cannot split to order-1 folio");
if (new_order == 1)
return false;
- } else if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
- !mapping_large_folio_support(folio->mapping)) {
- /*
- * No split if the file system does not support large folio.
- * Note that we might still have THPs in such mappings due to
- * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
- * does not actually support large folios properly.
- */
- VM_WARN_ONCE(warns,
- "Cannot split file folio to non-0 order");
- return false;
- }
-
- /* Only swapping a whole PMD-mapped folio is supported */
- if (folio_test_swapcache(folio)) {
- VM_WARN_ONCE(warns,
- "Cannot split swapcache folio to non-0 order");
- return false;
- }
-
- return true;
-}
-
-/* See comments in non_uniform_split_supported() */
-bool uniform_split_supported(struct folio *folio, unsigned int new_order,
- bool warns)
-{
- if (folio_test_anon(folio)) {
- VM_WARN_ONCE(warns && new_order == 1,
- "Cannot split to order-1 folio");
- if (new_order == 1)
- return false;
- } else if (new_order) {
+ } else if (split_type == SPLIT_TYPE_NON_UNIFORM || new_order) {
if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
!mapping_large_folio_support(folio->mapping)) {
+ /*
+ * We can always split a folio down to a single page
+ * (new_order == 0) uniformly.
+ *
+ * For any other scenario
+ * a) uniform split targeting a large folio
+ * (new_order > 0)
+ * b) any non-uniform split
+ * we must confirm that the file system supports large
+ * folios.
+ *
+ * Note that we might still have THPs in such
+ * mappings, which is created from khugepaged when
+ * CONFIG_READ_ONLY_THP_FOR_FS is enabled. But in that
+ * case, the mapping does not actually support large
+ * folios properly.
+ */
VM_WARN_ONCE(warns,
"Cannot split file folio to non-0 order");
return false;
}
}
- if (new_order && folio_test_swapcache(folio)) {
+ /*
+ * swapcache folio could only be split to order 0
+ *
+ * non-uniform split creates after-split folios with orders from
+ * folio_order(folio) - 1 to new_order, making it not suitable for any
+ * swapcache folio split. Only uniform split to order-0 can be used
+ * here.
+ */
+ if ((split_type == SPLIT_TYPE_NON_UNIFORM || new_order) && folio_test_swapcache(folio)) {
VM_WARN_ONCE(warns,
"Cannot split swapcache folio to non-0 order");
return false;
@@ -3711,11 +3704,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
if (new_order >= old_order)
return -EINVAL;
- if (split_type == SPLIT_TYPE_UNIFORM && !uniform_split_supported(folio, new_order, true))
- return -EINVAL;
-
- if (split_type == SPLIT_TYPE_NON_UNIFORM &&
- !non_uniform_split_supported(folio, new_order, true))
+ if (!folio_split_supported(folio, new_order, split_type, /* warn = */ true))
return -EINVAL;
is_hzp = is_huge_zero_folio(folio);
Synopsys renamed DWC_usb32 IP to DWC_usb4 as of IP version 1.30. No
functional change except checking for the IP_NAME here. The driver will
treat the new IP_NAME as if it's DWC_usb32. Additional features for USB4
will be introduced and checked separately.
Cc: stable(a)vger.kernel.org
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
---
drivers/usb/dwc3/core.c | 2 ++
drivers/usb/dwc3/core.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 96f85eada047..f71b75465a60 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -993,6 +993,8 @@ static bool dwc3_core_is_valid(struct dwc3 *dwc)
reg = dwc3_readl(dwc->regs, DWC3_GSNPSID);
dwc->ip = DWC3_GSNPS_ID(reg);
+ if (dwc->ip == DWC4_IP)
+ dwc->ip = DWC32_IP;
/* This should read as U3 followed by revision number */
if (DWC3_IP_IS(DWC3)) {
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index a5fc92c4ffa3..45757169b672 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1265,6 +1265,7 @@ struct dwc3 {
#define DWC3_IP 0x5533
#define DWC31_IP 0x3331
#define DWC32_IP 0x3332
+#define DWC4_IP 0x3430
u32 revision;
base-commit: 18514fd70ea4ca9de137bb3bceeac1bac4bcad75
--
2.28.0
Hi,
syzbot reported a circular locking dependency in the NET/ROM routing
code involving nr_neigh_list_lock, nr_node_list_lock and
nr_node->node_lock when nr_rt_device_down() interacts with the
ioctl path. This series fixes that deadlock and also addresses a
long-standing reference count leak found while auditing the same
code.
Patch 1/2 refactors nr_rt_device_down() to avoid nested locking
between nr_neigh_list_lock and nr_node_list_lock by doing two
separate passes over nodes and neighbours, and adjusts nr_rt_free()
to follow the same lock ordering.
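To illustrate the shape of that change (a generic sketch only; the
structure, list and helper names below are made up and are not the actual
net/netrom code), the idea is to never hold the two list locks at the same
time by doing two independent passes:

static void example_device_down(struct net_device *dev)
{
        struct example_node *node;
        struct example_neigh *neigh, *tmp;

        /* Pass 1: under the node list lock only, drop routes that go via
         * this device from every node.
         */
        spin_lock_bh(&example_node_list_lock);
        list_for_each_entry(node, &example_node_list, node_list) {
                spin_lock_bh(&node->node_lock);
                example_remove_routes_via(node, dev);
                spin_unlock_bh(&node->node_lock);
        }
        spin_unlock_bh(&example_node_list_lock);

        /* Pass 2: under the neighbour list lock only, remove neighbours
         * attached to this device.  No node lock is held here, so no
         * nesting between the two list locks can arise.
         */
        spin_lock_bh(&example_neigh_list_lock);
        list_for_each_entry_safe(neigh, tmp, &example_neigh_list, neigh_list) {
                if (neigh->dev == dev)
                        example_remove_neigh(neigh);
        }
        spin_unlock_bh(&example_neigh_list_lock);
}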
Patch 2/2 fixes a per-route reference count leak by dropping
nr_neigh->count and calling nr_neigh_put() when removing routes
from nr_rt_device_down(), mirroring the behaviour of
nr_dec_obs()/nr_del_node().
[1] https://syzkaller.appspot.com/bug?extid=14afda08dc3484d5db82
Thanks,
Junjie
Hi stable maintainers,
I have tried backporting some fixes to the stable 6.12.y kernel; these
also have CVE numbers and fix commits that are present in 6.12.y.
I am not a subsystem expert and have only done the overall testing we do
for stable release candidates, not any patch-specific testing.
Note: all of these patches are backports of commits already upstream.
PATCH 1: The broken commit is present in 6.12.y and the fix is a clean
cherry-pick and addresses CVE-2025-68206.
PATCH 2: The broken commit is present in 6.12.y and the fix is a clean
cherry-pick and addresses CVE-2025-40325.
PATCH 3: The broken commit is present in 6.12.y and the backport needed a
minor conflict resolution due to commit fe69a3918084
("drm/panthor: Fix UAF in panthor_gem_create_with_handle() debugfs")
missing in 6.12.y.
PATCH 4,5,6: Patches 4 and 5 are pulled in as prerequisites for PATCH 6,
which is a fix for CVE-2025-40170 and needed a minor conflict resolution
due to commit 22d6c9eebf2e ("net: Unexport shared functions for DCCP.")
missing in 6.12.y.
PATCH 7: The broken commit is present in 6.12.y and the backport of the
fix needed a minor conflict resolution due to a commit missing in 6.12.y.
This is the fix for CVE-2025-40164.
Please let me know if there are any comments.
Regards,
Harshit
Andrii Melnychenko (1):
netfilter: nft_ct: add seqadj extension for natted connections
Boris Brezillon (1):
drm/panthor: Flush shmem writes before mapping buffers CPU-uncached
Eric Dumazet (2):
ipv6: adopt dst_dev() helper
net: use dst_dev_rcu() in sk_setup_caps()
Justin Iurman (1):
net: ipv6: ioam6: use consistent dst names
Xiao Ni (1):
md/raid10: wait barrier before returning discard request with
REQ_NOWAIT
Zqiang (1):
usbnet: Fix using smp_processor_id() in preemptible code warnings
drivers/gpu/drm/panthor/panthor_gem.c | 18 +++++++++++++
drivers/md/raid10.c | 3 +--
drivers/net/usb/usbnet.c | 2 ++
include/net/ip.h | 6 +++--
include/net/ip6_route.h | 4 +--
include/net/route.h | 2 +-
net/core/sock.c | 16 +++++++-----
net/ipv6/exthdrs.c | 2 +-
net/ipv6/icmp.c | 4 ++-
net/ipv6/ila/ila_lwt.c | 2 +-
net/ipv6/ioam6_iptunnel.c | 37 ++++++++++++++-------------
net/ipv6/ip6_gre.c | 8 +++---
net/ipv6/ip6_output.c | 19 +++++++-------
net/ipv6/ip6_tunnel.c | 4 +--
net/ipv6/ip6_udp_tunnel.c | 2 +-
net/ipv6/ip6_vti.c | 2 +-
net/ipv6/ndisc.c | 6 +++--
net/ipv6/netfilter/nf_dup_ipv6.c | 2 +-
net/ipv6/output_core.c | 2 +-
net/ipv6/route.c | 20 +++++++++------
net/ipv6/rpl_iptunnel.c | 4 +--
net/ipv6/seg6_iptunnel.c | 20 ++++++++-------
net/ipv6/seg6_local.c | 2 +-
net/netfilter/nft_ct.c | 5 ++++
24 files changed, 118 insertions(+), 74 deletions(-)
--
2.50.1
On x86_64:
When the second-stage kernel is booted via kexec with a limiting command
line such as "mem=<size>", we observe the following page fault:
BUG: unable to handle page fault for address: ffff97793ff47000
RIP: ima_restore_measurement_list+0xdc/0x45a
#PF: error_code(0x0000) - not-present page
This happens on x86_64 only, as this is already fixed on arm64 in
commit cbf9c4b9617b ("of: check previous kernel's ima-kexec-buffer
against memory bounds").
V1: https://lore.kernel.org/all/20251112193005.3772542-1-harshit.m.mogalapalli@…
V1 attempted to do a similar sanity check on x86_64. Borislav suggested
adding a generic helper, ima_validate_range(), which could then be used
for both the OF-based path and x86_64.
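As a rough sketch of what such a range check might look like (this is an
assumption about the helper's shape based on the description above, not
the actual implementation from the series; the real signature and checks
may differ):

#include <linux/memblock.h>
#include <linux/types.h>

/* Hypothetical: reject an IMA kexec buffer that wraps around or does not
 * lie entirely within memory known to the new kernel.
 */
static bool ima_validate_range(phys_addr_t addr, size_t size)
{
        if (!size || addr + size < addr)        /* empty or wrapping range */
                return false;

        return memblock_is_region_memory(addr, size);
}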
Testing information:
--------------------
On x86_64: with the latest 6.19-rc2 based kernel we could reproduce the
issue, and the patched kernel works fine (with mem=8G on a 16G memory
machine).
Thanks to Yifei for finding that enabling IMA_KEXEC is the cause.
Thanks for the reviews on V1.
V1 -> V2:
- Patch 1: Add a generic helper "ima_validate_range()"
- Patch 2: Use this new helper in drivers/of/kexec.c -> No functional
change.
- Patch 3: Fix the page fault by doing sanity check with
"ima_validate_range()"
V2: https://lore.kernel.org/all/20251229081523.622515-1-harshit.m.mogalapalli@o…
V2 -> V3:
Update subject of Patch 1 to more appropriate one (Suggested by Mimi
Zohar)
Thanks,
Harshit
Harshit Mogalapalli (3):
ima: verify the previous kernel's IMA buffer lies in addressable RAM
of/kexec: refactor ima_get_kexec_buffer() to use ima_validate_range()
x86/kexec: Add a sanity check on previous kernel's ima kexec buffer
arch/x86/kernel/setup.c | 6 +++++
drivers/of/kexec.c | 15 +++----------
include/linux/ima.h | 1 +
security/integrity/ima/ima_kexec.c | 35 ++++++++++++++++++++++++++++++
4 files changed, 45 insertions(+), 12 deletions(-)
--
2.50.1
When vm.dirtytime_expire_seconds is set to 0, wakeup_dirtytime_writeback()
schedules delayed work with a delay of 0, causing immediate execution.
The function then reschedules itself with 0 delay again, creating an
infinite busy loop that causes 100% kworker CPU usage.
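Reduced to a sketch (the names match the fs-writeback code in the diff
below; the body of the worker is elided), the problematic self-rearming
pattern looks like this:

static void wakeup_dirtytime_writeback(struct work_struct *w)
{
        /* ... walk the bdi list and wake per-bdi writeback ... */

        /*
         * Unconditional re-arm: with dirtytime_expire_interval == 0 the
         * work is queued again with no delay, so it runs back-to-back and
         * pins a kworker at ~100% CPU.
         */
        schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);
}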
Fix by:
- Only scheduling delayed work in wakeup_dirtytime_writeback() when
dirtytime_expire_interval is non-zero
- Cancelling the delayed work in dirtytime_interval_handler() when
the interval is set to 0
- Adding a guard in start_dirtytime_writeback() for defensive coding
Tested by booting kernel in QEMU with virtme-ng:
- Before fix: kworker CPU spikes to ~73%
- After fix: CPU remains at normal levels
- Setting interval back to non-zero correctly resumes writeback
Fixes: a2f4870697a5 ("fs: make sure the timestamps for lazytime inodes eventually get written")
Cc: stable(a)vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220227
Signed-off-by: Laveesh Bansal <laveeshb(a)laveeshbansal.com>
---
fs/fs-writeback.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6800886c4d10..cd21c74cd0e5 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2492,7 +2492,8 @@ static void wakeup_dirtytime_writeback(struct work_struct *w)
wb_wakeup(wb);
}
rcu_read_unlock();
- schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);
+ if (dirtytime_expire_interval)
+ schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);
}
static int dirtytime_interval_handler(const struct ctl_table *table, int write,
@@ -2501,8 +2502,12 @@ static int dirtytime_interval_handler(const struct ctl_table *table, int write,
int ret;
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
- if (ret == 0 && write)
- mod_delayed_work(system_percpu_wq, &dirtytime_work, 0);
+ if (ret == 0 && write) {
+ if (dirtytime_expire_interval)
+ mod_delayed_work(system_percpu_wq, &dirtytime_work, 0);
+ else
+ cancel_delayed_work_sync(&dirtytime_work);
+ }
return ret;
}
@@ -2519,7 +2524,8 @@ static const struct ctl_table vm_fs_writeback_table[] = {
static int __init start_dirtytime_writeback(void)
{
- schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);
+ if (dirtytime_expire_interval)
+ schedule_delayed_work(&dirtytime_work, dirtytime_expire_interval * HZ);
register_sysctl_init("vm", vm_fs_writeback_table);
return 0;
}
--
2.43.0