The quilt patch titled
Subject: mm: numa,memblock: include <asm/numa.h> for 'numa_nodes_parsed'
has been removed from the -mm tree. Its filename was
mm-numamemblock-include-asm-numah-for-numa_nodes_parsed.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ben Dooks <ben.dooks(a)codethink.co.uk>
Subject: mm: numa,memblock: include <asm/numa.h> for 'numa_nodes_parsed'
Date: Thu, 8 Jan 2026 10:15:39 +0000
The 'numa_nodes_parsed' is defined in <asm/numa.h> but this file
is not included in mm/numa_memblks.c (build x86_64) so add this
to the incldues to fix the following sparse warning:
mm/numa_memblks.c:13:12: warning: symbol 'numa_nodes_parsed' was not declared. Should it be static?
Link: https://lkml.kernel.org/r/20260108101539.229192-1-ben.dooks@codethink.co.uk
Fixes: 87482708210f ("mm: introduce numa_memblks")
Signed-off-by: Ben Dooks <ben.dooks(a)codethink.co.uk>
Reviewed-by: Mike Rapoport (Microsoft) <rppt(a)kernel.org>
Cc: Ben Dooks <ben.dooks(a)codethink.co.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/numa_memblks.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/numa_memblks.c~mm-numamemblock-include-asm-numah-for-numa_nodes_parsed
+++ a/mm/numa_memblks.c
@@ -7,6 +7,8 @@
#include <linux/numa.h>
#include <linux/numa_memblks.h>
+#include <asm/numa.h>
+
int numa_distance_cnt;
static u8 *numa_distance;
_
Patches currently in -mm which might be from ben.dooks(a)codethink.co.uk are
The quilt patch titled
Subject: tools/testing/selftests: fix gup_longterm for unknown fs
has been removed from the -mm tree. Its filename was
tools-testing-selftests-fix-gup_longterm-for-unknown-fs.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: tools/testing/selftests: fix gup_longterm for unknown fs
Date: Tue, 6 Jan 2026 15:45:47 +0000
Commit 66bce7afbaca ("selftests/mm: fix test result reporting in
gup_longterm") introduced a small bug causing unknown filesystems to
always result in a test failure.
This is because do_test() was updated to use a common reporting path, but
this case appears to have been missed.
This is problematic for e.g. virtme-ng which uses an overlayfs file
system, causing gup_longterm to appear to fail each time due to a test
count mismatch:
# Planned tests != run tests (50 != 46)
# Totals: pass:24 fail:0 xfail:0 xpass:0 skip:22 error:0
The fix is to simply change the return into a break.
Link: https://lkml.kernel.org/r/20260106154547.214907-1-lorenzo.stoakes@oracle.com
Fixes: 66bce7afbaca ("selftests/mm: fix test result reporting in gup_longterm")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reviewed-by: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/gup_longterm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/gup_longterm.c~tools-testing-selftests-fix-gup_longterm-for-unknown-fs
+++ a/tools/testing/selftests/mm/gup_longterm.c
@@ -179,7 +179,7 @@ static void do_test(int fd, size_t size,
if (rw && shared && fs_is_unknown(fs_type)) {
ksft_print_msg("Unknown filesystem\n");
result = KSFT_SKIP;
- return;
+ break;
}
/*
* R/O pinning or pinning in a private mapping is always
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-do-not-leak-memory-when-mmap_prepare-swaps-the-file.patch
mm-rmap-improve-anon_vma_clone-unlink_anon_vmas-comments-add-asserts.patch
mm-rmap-skip-unfaulted-vmas-on-anon_vma-clone-unlink.patch
mm-rmap-remove-unnecessary-root-lock-dance-in-anon_vma-clone-unmap.patch
mm-rmap-remove-anon_vma_merge-function.patch
mm-rmap-make-anon_vma-functions-internal.patch
mm-mmap_lock-add-vma_is_attached-helper.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible-fix.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone-fix.patch
The quilt patch titled
Subject: mm/page_alloc: prevent pcp corruption with SMP=n
has been removed from the -mm tree. Its filename was
mm-page_alloc-prevent-pcp-corruption-with-smp=n.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm/page_alloc: prevent pcp corruption with SMP=n
Date: Mon, 05 Jan 2026 16:08:56 +0100
The kernel test robot has reported:
BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28
lock: 0xffff888807e35ef0, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0
CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-rc5-00127-ga06157804399 #1 PREEMPT 8cc09ef94dcec767faa911515ce9e609c45db470
Call Trace:
<IRQ>
__dump_stack (lib/dump_stack.c:95)
dump_stack_lvl (lib/dump_stack.c:123)
dump_stack (lib/dump_stack.c:130)
spin_dump (kernel/locking/spinlock_debug.c:71)
do_raw_spin_trylock (kernel/locking/spinlock_debug.c:?)
_raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
__free_frozen_pages (mm/page_alloc.c:2973)
___free_pages (mm/page_alloc.c:5295)
__free_pages (mm/page_alloc.c:5334)
tlb_remove_table_rcu (include/linux/mm.h:? include/linux/mm.h:3122 include/asm-generic/tlb.h:220 mm/mmu_gather.c:227 mm/mmu_gather.c:290)
? __cfi_tlb_remove_table_rcu (mm/mmu_gather.c:289)
? rcu_core (kernel/rcu/tree.c:?)
rcu_core (include/linux/rcupdate.h:341 kernel/rcu/tree.c:2607 kernel/rcu/tree.c:2861)
rcu_core_si (kernel/rcu/tree.c:2879)
handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623)
__irq_exit_rcu (arch/x86/include/asm/jump_label.h:36 kernel/softirq.c:725)
irq_exit_rcu (kernel/softirq.c:741)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
</IRQ>
<TASK>
RIP: 0010:_raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
free_pcppages_bulk (mm/page_alloc.c:1494)
drain_pages_zone (include/linux/spinlock.h:391 mm/page_alloc.c:2632)
__drain_all_pages (mm/page_alloc.c:2731)
drain_all_pages (mm/page_alloc.c:2747)
kcompactd (mm/compaction.c:3115)
kthread (kernel/kthread.c:465)
? __cfi_kcompactd (mm/compaction.c:3166)
? __cfi_kthread (kernel/kthread.c:412)
ret_from_fork (arch/x86/kernel/process.c:164)
? __cfi_kthread (kernel/kthread.c:412)
ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
</TASK>
Matthew has analyzed the report and identified that in drain_page_zone()
we are in a section protected by spin_lock(&pcp->lock) and then get an
interrupt that attempts spin_trylock() on the same lock. The code is
designed to work this way without disabling IRQs and occasionally fail the
trylock with a fallback. However, the SMP=n spinlock implementation
assumes spin_trylock() will always succeed, and thus it's normally a
no-op. Here the enabled lock debugging catches the problem, but otherwise
it could cause a corruption of the pcp structure.
The problem has been introduced by commit 574907741599 ("mm/page_alloc:
leave IRQs enabled for per-cpu page allocations"). The pcp locking scheme
recognizes the need for disabling IRQs to prevent nesting spin_trylock()
sections on SMP=n, but the need to prevent the nesting in spin_lock() has
not been recognized. Fix it by introducing local wrappers that change the
spin_lock() to spin_lock_iqsave() with SMP=n and use them in all places
that do spin_lock(&pcp->lock).
[vbabka(a)suse.cz: add pcp_ prefix to the spin_lock_irqsave wrappers, per Steven]
Link: https://lkml.kernel.org/r/20260105-fix-pcp-up-v1-1-5579662d2071@suse.cz
Fixes: 574907741599 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512101320.e2f2dd6f-lkp@intel.com
Analyzed-by: Matthew Wilcox <willy(a)infradead.org>
Link: https://lore.kernel.org/all/aUW05pyc9nZkvY-1@casper.infradead.org/
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Brendan Jackman <jackmanb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 47 ++++++++++++++++++++++++++++++++++++++--------
1 file changed, 39 insertions(+), 8 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-prevent-pcp-corruption-with-smp=n
+++ a/mm/page_alloc.c
@@ -167,6 +167,33 @@ static inline void __pcp_trylock_noop(un
pcp_trylock_finish(UP_flags); \
})
+/*
+ * With the UP spinlock implementation, when we spin_lock(&pcp->lock) (for i.e.
+ * a potentially remote cpu drain) and get interrupted by an operation that
+ * attempts pcp_spin_trylock(), we can't rely on the trylock failure due to UP
+ * spinlock assumptions making the trylock a no-op. So we have to turn that
+ * spin_lock() to a spin_lock_irqsave(). This works because on UP there are no
+ * remote cpu's so we can only be locking the only existing local one.
+ */
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
+static inline void __flags_noop(unsigned long *flags) { }
+#define pcp_spin_lock_maybe_irqsave(ptr, flags) \
+({ \
+ __flags_noop(&(flags)); \
+ spin_lock(&(ptr)->lock); \
+})
+#define pcp_spin_unlock_maybe_irqrestore(ptr, flags) \
+({ \
+ spin_unlock(&(ptr)->lock); \
+ __flags_noop(&(flags)); \
+})
+#else
+#define pcp_spin_lock_maybe_irqsave(ptr, flags) \
+ spin_lock_irqsave(&(ptr)->lock, flags)
+#define pcp_spin_unlock_maybe_irqrestore(ptr, flags) \
+ spin_unlock_irqrestore(&(ptr)->lock, flags)
+#endif
+
#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
DEFINE_PER_CPU(int, numa_node);
EXPORT_PER_CPU_SYMBOL(numa_node);
@@ -2556,6 +2583,7 @@ static int rmqueue_bulk(struct zone *zon
bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
{
int high_min, to_drain, to_drain_batched, batch;
+ unsigned long UP_flags;
bool todo = false;
high_min = READ_ONCE(pcp->high_min);
@@ -2575,9 +2603,9 @@ bool decay_pcp_high(struct zone *zone, s
to_drain = pcp->count - pcp->high;
while (to_drain > 0) {
to_drain_batched = min(to_drain, batch);
- spin_lock(&pcp->lock);
+ pcp_spin_lock_maybe_irqsave(pcp, UP_flags);
free_pcppages_bulk(zone, to_drain_batched, pcp, 0);
- spin_unlock(&pcp->lock);
+ pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags);
todo = true;
to_drain -= to_drain_batched;
@@ -2594,14 +2622,15 @@ bool decay_pcp_high(struct zone *zone, s
*/
void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
{
+ unsigned long UP_flags;
int to_drain, batch;
batch = READ_ONCE(pcp->batch);
to_drain = min(pcp->count, batch);
if (to_drain > 0) {
- spin_lock(&pcp->lock);
+ pcp_spin_lock_maybe_irqsave(pcp, UP_flags);
free_pcppages_bulk(zone, to_drain, pcp, 0);
- spin_unlock(&pcp->lock);
+ pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags);
}
}
#endif
@@ -2612,10 +2641,11 @@ void drain_zone_pages(struct zone *zone,
static void drain_pages_zone(unsigned int cpu, struct zone *zone)
{
struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+ unsigned long UP_flags;
int count;
do {
- spin_lock(&pcp->lock);
+ pcp_spin_lock_maybe_irqsave(pcp, UP_flags);
count = pcp->count;
if (count) {
int to_drain = min(count,
@@ -2624,7 +2654,7 @@ static void drain_pages_zone(unsigned in
free_pcppages_bulk(zone, to_drain, pcp, 0);
count -= to_drain;
}
- spin_unlock(&pcp->lock);
+ pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags);
} while (count);
}
@@ -6109,6 +6139,7 @@ static void zone_pcp_update_cacheinfo(st
{
struct per_cpu_pages *pcp;
struct cpu_cacheinfo *cci;
+ unsigned long UP_flags;
pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
cci = get_cpu_cacheinfo(cpu);
@@ -6119,12 +6150,12 @@ static void zone_pcp_update_cacheinfo(st
* This can reduce zone lock contention without hurting
* cache-hot pages sharing.
*/
- spin_lock(&pcp->lock);
+ pcp_spin_lock_maybe_irqsave(pcp, UP_flags);
if ((cci->per_cpu_data_slice_size >> PAGE_SHIFT) > 3 * pcp->batch)
pcp->flags |= PCPF_FREE_HIGH_BATCH;
else
pcp->flags &= ~PCPF_FREE_HIGH_BATCH;
- spin_unlock(&pcp->lock);
+ pcp_spin_unlock_maybe_irqrestore(pcp, UP_flags);
}
void setup_pcp_cacheinfo(unsigned int cpu)
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-page_alloc-thp-prevent-reclaim-for-__gfp_thisnode-thp-allocations.patch
mm-page_alloc-ignore-the-exact-initial-compaction-result.patch
mm-page_alloc-refactor-the-initial-compaction-handling.patch
mm-page_alloc-simplify-__alloc_pages_slowpath-flow.patch
The quilt patch titled
Subject: mm: kmsan: fix poisoning of high-order non-compound pages
has been removed from the -mm tree. Its filename was
mm-kmsan-fix-poisoning-of-high-order-non-compound-pages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryan Roberts <ryan.roberts(a)arm.com>
Subject: mm: kmsan: fix poisoning of high-order non-compound pages
Date: Sun, 4 Jan 2026 13:43:47 +0000
kmsan_free_page() is called by the page allocator's free_pages_prepare()
during page freeing. Its job is to poison all the memory covered by the
page. It can be called with an order-0 page, a compound high-order page
or a non-compound high-order page. But page_size() only works for order-0
and compound pages. For a non-compound high-order page it will
incorrectly return PAGE_SIZE.
The implication is that the tail pages of a high-order non-compound page
do not get poisoned at free, so any invalid access while they are free
could go unnoticed. It looks like the pages will be poisoned again at
allocation time, so that would bookend the window.
Fix this by using the order parameter to calculate the size.
Link: https://lkml.kernel.org/r/20260104134348.3544298-1-ryan.roberts@arm.com
Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
Reviewed-by: Alexander Potapenko <glider(a)google.com>
Tested-by: Alexander Potapenko <glider(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmsan/shadow.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/kmsan/shadow.c~mm-kmsan-fix-poisoning-of-high-order-non-compound-pages
+++ a/mm/kmsan/shadow.c
@@ -207,7 +207,7 @@ void kmsan_free_page(struct page *page,
if (!kmsan_enabled || kmsan_in_runtime())
return;
kmsan_enter_runtime();
- kmsan_internal_poison_memory(page_address(page), page_size(page),
+ kmsan_internal_poison_memory(page_address(page), PAGE_SIZE << order,
GFP_KERNEL & ~(__GFP_RECLAIM),
KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
kmsan_leave_runtime();
_
Patches currently in -mm which might be from ryan.roberts(a)arm.com are
The quilt patch titled
Subject: mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too
has been removed from the -mm tree. Its filename was
mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too
Date: Mon, 5 Jan 2026 20:11:49 +0000
The is_mergeable_anon_vma() function uses vmg->middle as the source VMA.
However when merging a new VMA, this field is NULL.
In all cases except mremap(), the new VMA will either be newly established
and thus lack an anon_vma, or will be an expansion of an existing VMA thus
we do not care about whether VMA is CoW'd or not.
In the case of an mremap(), we can end up in a situation where we can
accidentally allow an unfaulted/faulted merge with a VMA that has been
forked, violating the general rule that we do not permit this for reasons
of anon_vma lock scalability.
Now we have the ability to be aware of the fact we are copying a VMA and
also know which VMA that is, we can explicitly check for this, so do so.
This is pertinent since commit 879bca0a2c4f ("mm/vma: fix incorrectly
disallowed anonymous VMA merges"), as this patch permits unfaulted/faulted
merges that were previously disallowed running afoul of this issue.
While we are here, vma_had_uncowed_parents() is a confusing name, so make
it simple and rename it to vma_is_fork_child().
Link: https://lkml.kernel.org/r/6e2b9b3024ae1220961c8b81d74296d4720eaf2b.17676382…
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reviewed-by: Harry Yoo <harry.yoo(a)oracle.com>
Reviewed-by: Jeongjun Park <aha310510(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
--- a/mm/vma.c~mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too
+++ a/mm/vma.c
@@ -67,18 +67,13 @@ struct mmap_state {
.state = VMA_MERGE_START, \
}
-/*
- * If, at any point, the VMA had unCoW'd mappings from parents, it will maintain
- * more than one anon_vma_chain connecting it to more than one anon_vma. A merge
- * would mean a wider range of folios sharing the root anon_vma lock, and thus
- * potential lock contention, we do not wish to encourage merging such that this
- * scales to a problem.
- */
-static bool vma_had_uncowed_parents(struct vm_area_struct *vma)
+/* Was this VMA ever forked from a parent, i.e. maybe contains CoW mappings? */
+static bool vma_is_fork_child(struct vm_area_struct *vma)
{
/*
* The list_is_singular() test is to avoid merging VMA cloned from
- * parents. This can improve scalability caused by anon_vma lock.
+ * parents. This can improve scalability caused by the anon_vma root
+ * lock.
*/
return vma && vma->anon_vma && !list_is_singular(&vma->anon_vma_chain);
}
@@ -115,11 +110,19 @@ static bool is_mergeable_anon_vma(struct
VM_WARN_ON(src && src_anon != src->anon_vma);
/* Case 1 - we will dup_anon_vma() from src into tgt. */
- if (!tgt_anon && src_anon)
- return !vma_had_uncowed_parents(src);
+ if (!tgt_anon && src_anon) {
+ struct vm_area_struct *copied_from = vmg->copied_from;
+
+ if (vma_is_fork_child(src))
+ return false;
+ if (vma_is_fork_child(copied_from))
+ return false;
+
+ return true;
+ }
/* Case 2 - we will simply use tgt's anon_vma. */
if (tgt_anon && !src_anon)
- return !vma_had_uncowed_parents(tgt);
+ return !vma_is_fork_child(tgt);
/* Case 3 - the anon_vma's are already shared. */
return src_anon == tgt_anon;
}
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-do-not-leak-memory-when-mmap_prepare-swaps-the-file.patch
mm-rmap-improve-anon_vma_clone-unlink_anon_vmas-comments-add-asserts.patch
mm-rmap-skip-unfaulted-vmas-on-anon_vma-clone-unlink.patch
mm-rmap-remove-unnecessary-root-lock-dance-in-anon_vma-clone-unmap.patch
mm-rmap-remove-anon_vma_merge-function.patch
mm-rmap-make-anon_vma-functions-internal.patch
mm-mmap_lock-add-vma_is_attached-helper.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible-fix.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone-fix.patch
The quilt patch titled
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
has been removed from the -mm tree. Its filename was
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
Date: Mon, 5 Jan 2026 20:11:47 +0000
Patch series "mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted
merge", v2.
Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.
However, it is handling merges incorrectly when it comes to mremap() of a
faulted VMA adjacent to an unfaulted VMA. The issues arise in three
cases:
1. Previous VMA unfaulted:
copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev
2. Next VMA unfaulted:
copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next
3. Both adjacent VMAs unfaulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next
This series fixes each of these cases, and introduces self tests to assert
that the issues are corrected.
I also test a further case which was already handled, to assert that my
changes continues to correctly handle it:
4. prev unfaulted, next faulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next
This bug was discovered via a syzbot report, linked to in the first patch
in the series, I confirmed that this series fixes the bug.
I also discovered that we are failing to check that the faulted VMA was
not forked when merging a copied VMA in cases 1-3 above, an issue this
series also addresses.
I also added self tests to assert that this is resolved (and confirmed
that the tests failed prior to this).
I also cleaned up vma_expand() as part of this work, renamed
vma_had_uncowed_parents() to vma_is_fork_child() as the previous name was
unduly confusing, and simplified the comments around this function.
This patch (of 4):
Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.
The key piece of logic introduced was the ability to merge a faulted VMA
immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to
correctly handle anon_vma state.
In the case of the merge of an existing VMA (that is changing properties
of a VMA and then merging if those properties are shared by adjacent
VMAs), dup_anon_vma() is invoked correctly.
However in the case of the merge of a new VMA, a corner case peculiar to
mremap() was missed.
The issue is that vma_expand() only performs dup_anon_vma() if the target
(the VMA that will ultimately become the merged VMA): is not the next VMA,
i.e. the one that appears after the range in which the new VMA is to be
established.
A key insight here is that in all other cases other than mremap(), a new
VMA merge either expands an existing VMA, meaning that the target VMA will
be that VMA, or would have anon_vma be NULL.
Specifically:
* __mmap_region() - no anon_vma in place, initial mapping.
* do_brk_flags() - expanding an existing VMA.
* vma_merge_extend() - expanding an existing VMA.
* relocate_vma_down() - no anon_vma in place, initial mapping.
In addition, we are in the unique situation of needing to duplicate
anon_vma state from a VMA that is neither the previous or next VMA being
merged with.
dup_anon_vma() deals exclusively with the target=unfaulted, src=faulted
case. This leaves four possibilities, in each case where the copied VMA
is faulted:
1. Previous VMA unfaulted:
copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev
target = prev, expand prev to cover.
2. Next VMA unfaulted:
copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next
target = next, expand next to cover.
3. Both adjacent VMAs unfaulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover.
4. prev unfaulted, next faulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover. Essentially equivalent to 3, but
with additional requirement that next's anon_vma is the same as the copied
VMA's. This is covered by the existing logic.
To account for this very explicitly, we introduce
vma_merge_copied_range(), which sets a newly introduced vmg->copied_from
field, then invokes vma_merge_new_range() which handles the rest of the
logic.
We then update the key vma_expand() function to clean up the logic and
make what's going on clearer, making the 'remove next' case less special,
before invoking dup_anon_vma() unconditionally should we be copying from a
VMA.
Note that in case 3, the if (remove_next) ... branch will be a no-op, as
next=src in this instance and src is unfaulted.
In case 4, it won't be, but since in this instance next=src and it is
faulted, this will have required tgt=faulted, src=faulted to be
compatible, meaning that next->anon_vma == vmg->copied_from->anon_vma, and
thus a single dup_anon_vma() of next suffices to copy anon_vma state for
the copied-from VMA also.
If we are copying from a VMA in a successful merge we must _always_
propagate anon_vma state.
This issue can be observed most directly by invoked mremap() to move
around a VMA and cause this kind of merge with the MREMAP_DONTUNMAP flag
specified.
This will result in unlink_anon_vmas() being called after failing to
duplicate anon_vma state to the target VMA, which results in the anon_vma
itself being freed with folios still possessing dangling pointers to the
anon_vma and thus a use-after-free bug.
This bug was discovered via a syzbot report, which this patch resolves.
We further make a change to update the mergeable anon_vma check to assert
the copied-from anon_vma did not have CoW parents, as otherwise
dup_anon_vma() might incorrectly propagate CoW ancestors from the next VMA
in case 4 despite the anon_vma's being identical for both VMAs.
Link: https://lkml.kernel.org/r/cover.1767638272.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/b7930ad2b1503a657e29fe928eb33061d7eadf5b.17676382…
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Reported-by: syzbot+b165fc2e11771c66d8ba(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/
Reported-by: syzbot+5272541ccbbb14e2ec30(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694e3dc6.050a0220.35954c.0066.GAE@google.com/
Reviewed-by: Harry Yoo <harry.yoo(a)oracle.com>
Reviewed-by: Jeongjun Park <aha310510(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 84 +++++++++++++++++++++++++++++++++++++----------------
mm/vma.h | 3 +
2 files changed, 62 insertions(+), 25 deletions(-)
--- a/mm/vma.c~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.c
@@ -829,6 +829,8 @@ static __must_check struct vm_area_struc
VM_WARN_ON_VMG(middle &&
!(vma_iter_addr(vmg->vmi) >= middle->vm_start &&
vma_iter_addr(vmg->vmi) < middle->vm_end), vmg);
+ /* An existing merge can never be used by the mremap() logic. */
+ VM_WARN_ON_VMG(vmg->copied_from, vmg);
vmg->state = VMA_MERGE_NOMERGE;
@@ -1099,6 +1101,33 @@ struct vm_area_struct *vma_merge_new_ran
}
/*
+ * vma_merge_copied_range - Attempt to merge a VMA that is being copied by
+ * mremap()
+ *
+ * @vmg: Describes the VMA we are adding, in the copied-to range @vmg->start to
+ * @vmg->end (exclusive), which we try to merge with any adjacent VMAs if
+ * possible.
+ *
+ * vmg->prev, next, start, end, pgoff should all be relative to the COPIED TO
+ * range, i.e. the target range for the VMA.
+ *
+ * Returns: In instances where no merge was possible, NULL. Otherwise, a pointer
+ * to the VMA we expanded.
+ *
+ * ASSUMPTIONS: Same as vma_merge_new_range(), except vmg->middle must contain
+ * the copied-from VMA.
+ */
+static struct vm_area_struct *vma_merge_copied_range(struct vma_merge_struct *vmg)
+{
+ /* We must have a copied-from VMA. */
+ VM_WARN_ON_VMG(!vmg->middle, vmg);
+
+ vmg->copied_from = vmg->middle;
+ vmg->middle = NULL;
+ return vma_merge_new_range(vmg);
+}
+
+/*
* vma_expand - Expand an existing VMA
*
* @vmg: Describes a VMA expansion operation.
@@ -1117,46 +1146,52 @@ struct vm_area_struct *vma_merge_new_ran
int vma_expand(struct vma_merge_struct *vmg)
{
struct vm_area_struct *anon_dup = NULL;
- bool remove_next = false;
struct vm_area_struct *target = vmg->target;
struct vm_area_struct *next = vmg->next;
+ bool remove_next = false;
vm_flags_t sticky_flags;
-
- sticky_flags = vmg->vm_flags & VM_STICKY;
- sticky_flags |= target->vm_flags & VM_STICKY;
-
- VM_WARN_ON_VMG(!target, vmg);
+ int ret = 0;
mmap_assert_write_locked(vmg->mm);
-
vma_start_write(target);
- if (next && (target != next) && (vmg->end == next->vm_end)) {
- int ret;
- sticky_flags |= next->vm_flags & VM_STICKY;
+ if (next && target != next && vmg->end == next->vm_end)
remove_next = true;
- /* This should already have been checked by this point. */
- VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
- vma_start_write(next);
- /*
- * In this case we don't report OOM, so vmg->give_up_on_mm is
- * safe.
- */
- ret = dup_anon_vma(target, next, &anon_dup);
- if (ret)
- return ret;
- }
+ /* We must have a target. */
+ VM_WARN_ON_VMG(!target, vmg);
+ /* This should have already been checked by this point. */
+ VM_WARN_ON_VMG(remove_next && !can_merge_remove_vma(next), vmg);
/* Not merging but overwriting any part of next is not handled. */
VM_WARN_ON_VMG(next && !remove_next &&
next != target && vmg->end > next->vm_start, vmg);
- /* Only handles expanding */
+ /* Only handles expanding. */
VM_WARN_ON_VMG(target->vm_start < vmg->start ||
target->vm_end > vmg->end, vmg);
+ sticky_flags = vmg->vm_flags & VM_STICKY;
+ sticky_flags |= target->vm_flags & VM_STICKY;
if (remove_next)
- vmg->__remove_next = true;
+ sticky_flags |= next->vm_flags & VM_STICKY;
+
+ /*
+ * If we are removing the next VMA or copying from a VMA
+ * (e.g. mremap()'ing), we must propagate anon_vma state.
+ *
+ * Note that, by convention, callers ignore OOM for this case, so
+ * we don't need to account for vmg->give_up_on_mm here.
+ */
+ if (remove_next)
+ ret = dup_anon_vma(target, next, &anon_dup);
+ if (!ret && vmg->copied_from)
+ ret = dup_anon_vma(target, vmg->copied_from, &anon_dup);
+ if (ret)
+ return ret;
+ if (remove_next) {
+ vma_start_write(next);
+ vmg->__remove_next = true;
+ }
if (commit_merge(vmg))
goto nomem;
@@ -1828,10 +1863,9 @@ struct vm_area_struct *copy_vma(struct v
if (new_vma && new_vma->vm_start < addr + len)
return NULL; /* should never get here */
- vmg.middle = NULL; /* New VMA range. */
vmg.pgoff = pgoff;
vmg.next = vma_iter_next_rewind(&vmi, NULL);
- new_vma = vma_merge_new_range(&vmg);
+ new_vma = vma_merge_copied_range(&vmg);
if (new_vma) {
/*
--- a/mm/vma.h~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.h
@@ -106,6 +106,9 @@ struct vma_merge_struct {
struct anon_vma_name *anon_name;
enum vma_merge_state state;
+ /* If copied from (i.e. mremap()'d) the VMA from which we are copying. */
+ struct vm_area_struct *copied_from;
+
/* Flags which callers can use to modify merge behaviour: */
/*
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-do-not-leak-memory-when-mmap_prepare-swaps-the-file.patch
mm-rmap-improve-anon_vma_clone-unlink_anon_vmas-comments-add-asserts.patch
mm-rmap-skip-unfaulted-vmas-on-anon_vma-clone-unlink.patch
mm-rmap-remove-unnecessary-root-lock-dance-in-anon_vma-clone-unmap.patch
mm-rmap-remove-anon_vma_merge-function.patch
mm-rmap-make-anon_vma-functions-internal.patch
mm-mmap_lock-add-vma_is_attached-helper.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible.patch
mm-rmap-allocate-anon_vma_chain-objects-unlocked-when-possible-fix.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone.patch
mm-rmap-separate-out-fork-only-logic-on-anon_vma_clone-fix.patch
The quilt patch titled
Subject: mm/zswap: fix error pointer free in zswap_cpu_comp_prepare()
has been removed from the -mm tree. Its filename was
mm-zswap-fix-error-pointer-free-in-zswap_cpu_comp_prepare.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Pavel Butsykin <pbutsykin(a)cloudlinux.com>
Subject: mm/zswap: fix error pointer free in zswap_cpu_comp_prepare()
Date: Wed, 31 Dec 2025 11:46:38 +0400
crypto_alloc_acomp_node() may return ERR_PTR(), but the fail path checks
only for NULL and can pass an error pointer to crypto_free_acomp(). Use
IS_ERR_OR_NULL() to only free valid acomp instances.
Link: https://lkml.kernel.org/r/20251231074638.2564302-1-pbutsykin@cloudlinux.com
Fixes: 779b9955f643 ("mm: zswap: move allocations during CPU init outside the lock")
Signed-off-by: Pavel Butsykin <pbutsykin(a)cloudlinux.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Yosry Ahmed <yosry.ahmed(a)linux.dev>
Acked-by: Nhat Pham <nphamcs(a)gmail.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/zswap.c~mm-zswap-fix-error-pointer-free-in-zswap_cpu_comp_prepare
+++ a/mm/zswap.c
@@ -787,7 +787,7 @@ static int zswap_cpu_comp_prepare(unsign
return 0;
fail:
- if (acomp)
+ if (!IS_ERR_OR_NULL(acomp))
crypto_free_acomp(acomp);
kfree(buffer);
return ret;
_
Patches currently in -mm which might be from pbutsykin(a)cloudlinux.com are
The quilt patch titled
Subject: mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-scheme-cleanup-access_pattern-subdirs-on-scheme-dir-setup-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure
Date: Wed, 24 Dec 2025 18:30:37 -0800
When a DAMOS-scheme DAMON sysfs directory setup fails after setup of
access_pattern/ directory, subdirectories of access_pattern/ directory are
not cleaned up. As a result, DAMON sysfs interface is nearly broken until
the system reboots, and the memory for the unremoved directory is leaked.
Cleanup the directories under such failures.
Link: https://lkml.kernel.org/r/20251225023043.18579-5-sj@kernel.org
Fixes: 9bbb820a5bd5 ("mm/damon/sysfs: support DAMOS quotas")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: chongjiapeng <jiapeng.chong(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs-schemes.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/damon/sysfs-schemes.c~mm-damon-sysfs-scheme-cleanup-access_pattern-subdirs-on-scheme-dir-setup-failure
+++ a/mm/damon/sysfs-schemes.c
@@ -2152,7 +2152,7 @@ static int damon_sysfs_scheme_add_dirs(s
return err;
err = damos_sysfs_set_dests(scheme);
if (err)
- goto put_access_pattern_out;
+ goto rmdir_put_access_pattern_out;
err = damon_sysfs_scheme_set_quotas(scheme);
if (err)
goto put_dests_out;
@@ -2190,7 +2190,8 @@ rmdir_put_quotas_access_pattern_out:
put_dests_out:
kobject_put(&scheme->dests->kobj);
scheme->dests = NULL;
-put_access_pattern_out:
+rmdir_put_access_pattern_out:
+ damon_sysfs_access_pattern_rm_dirs(scheme->access_pattern);
kobject_put(&scheme->access_pattern->kobj);
scheme->access_pattern = NULL;
return err;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-introduce-nr_snapshots-damos-stat.patch
mm-damon-sysfs-schemes-introduce-nr_snapshots-damos-stat-file.patch
docs-mm-damon-design-update-for-nr_snapshots-damos-stat.patch
docs-admin-guide-mm-damon-usage-update-for-nr_snapshots-damos-stat.patch
docs-abi-damon-update-for-nr_snapshots-damos-stat.patch
mm-damon-update-damos-kerneldoc-for-stat-field.patch
mm-damon-core-implement-max_nr_snapshots.patch
mm-damon-sysfs-schemes-implement-max_nr_snapshots-file.patch
docs-mm-damon-design-update-for-max_nr_snapshots.patch
docs-admin-guide-mm-damon-usage-update-for-max_nr_snapshots.patch
docs-abi-damon-update-for-max_nr_snapshots.patch
mm-damon-core-add-trace-point-for-damos-stat-per-apply-interval.patch
mm-damon-tests-core-kunit-add-test-cases-for-multiple-regions-in-damon_test_split_regions_of-fix.patch
The quilt patch titled
Subject: mm/damon/sysfs-scheme: cleanup quotas subdirs on scheme dir setup failure
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-scheme-cleanup-quotas-subdirs-on-scheme-dir-setup-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs-scheme: cleanup quotas subdirs on scheme dir setup failure
Date: Wed, 24 Dec 2025 18:30:36 -0800
When a DAMOS-scheme DAMON sysfs directory setup fails after setup of
quotas/ directory, subdirectories of quotas/ directory are not cleaned up.
As a result, DAMON sysfs interface is nearly broken until the system
reboots, and the memory for the unremoved directory is leaked.
Cleanup the directories under such failures.
Link: https://lkml.kernel.org/r/20251225023043.18579-4-sj@kernel.org
Fixes: 1b32234ab087 ("mm/damon/sysfs: support DAMOS watermarks")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: chongjiapeng <jiapeng.chong(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs-schemes.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/damon/sysfs-schemes.c~mm-damon-sysfs-scheme-cleanup-quotas-subdirs-on-scheme-dir-setup-failure
+++ a/mm/damon/sysfs-schemes.c
@@ -2158,7 +2158,7 @@ static int damon_sysfs_scheme_add_dirs(s
goto put_dests_out;
err = damon_sysfs_scheme_set_watermarks(scheme);
if (err)
- goto put_quotas_access_pattern_out;
+ goto rmdir_put_quotas_access_pattern_out;
err = damos_sysfs_set_filter_dirs(scheme);
if (err)
goto put_watermarks_quotas_access_pattern_out;
@@ -2183,7 +2183,8 @@ put_filters_watermarks_quotas_access_pat
put_watermarks_quotas_access_pattern_out:
kobject_put(&scheme->watermarks->kobj);
scheme->watermarks = NULL;
-put_quotas_access_pattern_out:
+rmdir_put_quotas_access_pattern_out:
+ damon_sysfs_quotas_rm_dirs(scheme->quotas);
kobject_put(&scheme->quotas->kobj);
scheme->quotas = NULL;
put_dests_out:
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-introduce-nr_snapshots-damos-stat.patch
mm-damon-sysfs-schemes-introduce-nr_snapshots-damos-stat-file.patch
docs-mm-damon-design-update-for-nr_snapshots-damos-stat.patch
docs-admin-guide-mm-damon-usage-update-for-nr_snapshots-damos-stat.patch
docs-abi-damon-update-for-nr_snapshots-damos-stat.patch
mm-damon-update-damos-kerneldoc-for-stat-field.patch
mm-damon-core-implement-max_nr_snapshots.patch
mm-damon-sysfs-schemes-implement-max_nr_snapshots-file.patch
docs-mm-damon-design-update-for-max_nr_snapshots.patch
docs-admin-guide-mm-damon-usage-update-for-max_nr_snapshots.patch
docs-abi-damon-update-for-max_nr_snapshots.patch
mm-damon-core-add-trace-point-for-damos-stat-per-apply-interval.patch
mm-damon-tests-core-kunit-add-test-cases-for-multiple-regions-in-damon_test_split_regions_of-fix.patch
The quilt patch titled
Subject: mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-cleanup-attrs-subdirs-on-context-dir-setup-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
Date: Wed, 24 Dec 2025 18:30:35 -0800
When a context DAMON sysfs directory setup is failed after setup of attrs/
directory, subdirectories of attrs/ directory are not cleaned up. As a
result, DAMON sysfs interface is nearly broken until the system reboots,
and the memory for the unremoved directory is leaked.
Cleanup the directories under such failures.
Link: https://lkml.kernel.org/r/20251225023043.18579-3-sj@kernel.org
Fixes: c951cd3b8901 ("mm/damon: implement a minimal stub for sysfs-based DAMON interface")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: chongjiapeng <jiapeng.chong(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-cleanup-attrs-subdirs-on-context-dir-setup-failure
+++ a/mm/damon/sysfs.c
@@ -950,7 +950,7 @@ static int damon_sysfs_context_add_dirs(
err = damon_sysfs_context_set_targets(context);
if (err)
- goto put_attrs_out;
+ goto rmdir_put_attrs_out;
err = damon_sysfs_context_set_schemes(context);
if (err)
@@ -960,7 +960,8 @@ static int damon_sysfs_context_add_dirs(
put_targets_attrs_out:
kobject_put(&context->targets->kobj);
context->targets = NULL;
-put_attrs_out:
+rmdir_put_attrs_out:
+ damon_sysfs_attrs_rm_dirs(context->attrs);
kobject_put(&context->attrs->kobj);
context->attrs = NULL;
return err;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-introduce-nr_snapshots-damos-stat.patch
mm-damon-sysfs-schemes-introduce-nr_snapshots-damos-stat-file.patch
docs-mm-damon-design-update-for-nr_snapshots-damos-stat.patch
docs-admin-guide-mm-damon-usage-update-for-nr_snapshots-damos-stat.patch
docs-abi-damon-update-for-nr_snapshots-damos-stat.patch
mm-damon-update-damos-kerneldoc-for-stat-field.patch
mm-damon-core-implement-max_nr_snapshots.patch
mm-damon-sysfs-schemes-implement-max_nr_snapshots-file.patch
docs-mm-damon-design-update-for-max_nr_snapshots.patch
docs-admin-guide-mm-damon-usage-update-for-max_nr_snapshots.patch
docs-abi-damon-update-for-max_nr_snapshots.patch
mm-damon-core-add-trace-point-for-damos-stat-per-apply-interval.patch
mm-damon-tests-core-kunit-add-test-cases-for-multiple-regions-in-damon_test_split_regions_of-fix.patch
The quilt patch titled
Subject: mm/damon/sysfs: cleanup intervals subdirs on attrs dir setup failure
has been removed from the -mm tree. Its filename was
mm-damon-sysfs-cleanup-intervals-subdirs-on-attrs-dir-setup-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/sysfs: cleanup intervals subdirs on attrs dir setup failure
Date: Wed, 24 Dec 2025 18:30:34 -0800
Patch series "mm/damon/sysfs: free setup failures generated zombie sub-sub
dirs".
Some DAMON sysfs directory setup functions generates its sub and sub-sub
directories. For example, 'monitoring_attrs/' directory setup creates
'intervals/' and 'intervals/intervals_goal/' directories under
'monitoring_attrs/' directory. When such sub-sub directories are
successfully made but followup setup is failed, the setup function should
recursively clean up the subdirectories.
However, such setup functions are only dereferencing sub directory
reference counters. As a result, under certain setup failures, the
sub-sub directories keep having non-zero reference counters. It means the
directories cannot be removed like zombies, and the memory for the
directories cannot be freed.
The user impact of this issue is limited due to the following reasons.
When the issue happens, the zombie directories are still taking the path.
Hence attempts to generate the directories again will fail, without
additional memory leak. This means the upper bound memory leak is
limited. Nonetheless this also implies controlling DAMON with a feature
that requires the setup-failed sysfs files will be impossible until the
system reboots.
Also, the setup operations are quite simple. The certain failures would
hence only rarely happen, and are difficult to artificially trigger.
This patch (of 4):
When attrs/ DAMON sysfs directory setup is failed after setup of
intervals/ directory, intervals/intervals_goal/ directory is not cleaned
up. As a result, DAMON sysfs interface is nearly broken until the system
reboots, and the memory for the unremoved directory is leaked.
Cleanup the directory under such failures.
Link: https://lkml.kernel.org/r/20251225023043.18579-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20251225023043.18579-2-sj@kernel.org
Fixes: 8fbbcbeaafeb ("mm/damon/sysfs: implement intervals tuning goal directory")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: chongjiapeng <jiapeng.chong(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org> # 6.15.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/sysfs.c~mm-damon-sysfs-cleanup-intervals-subdirs-on-attrs-dir-setup-failure
+++ a/mm/damon/sysfs.c
@@ -792,7 +792,7 @@ static int damon_sysfs_attrs_add_dirs(st
nr_regions_range = damon_sysfs_ul_range_alloc(10, 1000);
if (!nr_regions_range) {
err = -ENOMEM;
- goto put_intervals_out;
+ goto rmdir_put_intervals_out;
}
err = kobject_init_and_add(&nr_regions_range->kobj,
@@ -806,6 +806,8 @@ static int damon_sysfs_attrs_add_dirs(st
put_nr_regions_intervals_out:
kobject_put(&nr_regions_range->kobj);
attrs->nr_regions_range = NULL;
+rmdir_put_intervals_out:
+ damon_sysfs_intervals_rm_dirs(intervals);
put_intervals_out:
kobject_put(&intervals->kobj);
attrs->intervals = NULL;
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-introduce-nr_snapshots-damos-stat.patch
mm-damon-sysfs-schemes-introduce-nr_snapshots-damos-stat-file.patch
docs-mm-damon-design-update-for-nr_snapshots-damos-stat.patch
docs-admin-guide-mm-damon-usage-update-for-nr_snapshots-damos-stat.patch
docs-abi-damon-update-for-nr_snapshots-damos-stat.patch
mm-damon-update-damos-kerneldoc-for-stat-field.patch
mm-damon-core-implement-max_nr_snapshots.patch
mm-damon-sysfs-schemes-implement-max_nr_snapshots-file.patch
docs-mm-damon-design-update-for-max_nr_snapshots.patch
docs-admin-guide-mm-damon-usage-update-for-max_nr_snapshots.patch
docs-abi-damon-update-for-max_nr_snapshots.patch
mm-damon-core-add-trace-point-for-damos-stat-per-apply-interval.patch
mm-damon-tests-core-kunit-add-test-cases-for-multiple-regions-in-damon_test_split_regions_of-fix.patch
The quilt patch titled
Subject: mm/damon/core: remove call_control in inactive contexts
has been removed from the -mm tree. Its filename was
mm-damon-core-remove-call_control-in-inactive-contexts.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: mm/damon/core: remove call_control in inactive contexts
Date: Tue, 30 Dec 2025 17:23:13 -0800
If damon_call() is executed against a DAMON context that is not running,
the function returns error while keeping the damon_call_control object
linked to the context's call_controls list. Let's suppose the object is
deallocated after the damon_call(), and yet another damon_call() is
executed against the same context. The function tries to add the new
damon_call_control object to the call_controls list, which still has the
pointer to the previous damon_call_control object, which is deallocated.
As a result, use-after-free happens.
This can actually be triggered using the DAMON sysfs interface. It is not
easily exploitable since it requires the sysfs write permission and making
a definitely weird file writes, though. Please refer to the report for
more details about the issue reproduction steps.
Fix the issue by making two changes. Firstly, move the final
kdamond_call() for cancelling all existing damon_call() requests from
terminating DAMON context to be done before the ctx->kdamond reset. This
makes any code that sees NULL ctx->kdamond can safely assume the context
may not access damon_call() requests anymore. Secondly, let damon_call()
to cleanup the damon_call_control objects that were added to the
already-terminated DAMON context, before returning the error.
Link: https://lkml.kernel.org/r/20251231012315.75835-1-sj@kernel.org
Fixes: 004ded6bee11 ("mm/damon: accept parallel damon_call() requests")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Reported-by: JaeJoon Jung <rgbi3307(a)gmail.com>
Closes: https://lore.kernel.org/20251224094401.20384-1-rgbi3307@gmail.com
Cc: <stable(a)vger.kernel.org> # 6.17.x
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 33 +++++++++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)
--- a/mm/damon/core.c~mm-damon-core-remove-call_control-in-inactive-contexts
+++ a/mm/damon/core.c
@@ -1431,6 +1431,35 @@ bool damon_is_running(struct damon_ctx *
return running;
}
+/*
+ * damon_call_handle_inactive_ctx() - handle DAMON call request that added to
+ * an inactive context.
+ * @ctx: The inactive DAMON context.
+ * @control: Control variable of the call request.
+ *
+ * This function is called in a case that @control is added to @ctx but @ctx is
+ * not running (inactive). See if @ctx handled @control or not, and cleanup
+ * @control if it was not handled.
+ *
+ * Returns 0 if @control was handled by @ctx, negative error code otherwise.
+ */
+static int damon_call_handle_inactive_ctx(
+ struct damon_ctx *ctx, struct damon_call_control *control)
+{
+ struct damon_call_control *c;
+
+ mutex_lock(&ctx->call_controls_lock);
+ list_for_each_entry(c, &ctx->call_controls, list) {
+ if (c == control) {
+ list_del(&control->list);
+ mutex_unlock(&ctx->call_controls_lock);
+ return -EINVAL;
+ }
+ }
+ mutex_unlock(&ctx->call_controls_lock);
+ return 0;
+}
+
/**
* damon_call() - Invoke a given function on DAMON worker thread (kdamond).
* @ctx: DAMON context to call the function for.
@@ -1461,7 +1490,7 @@ int damon_call(struct damon_ctx *ctx, st
list_add_tail(&control->list, &ctx->call_controls);
mutex_unlock(&ctx->call_controls_lock);
if (!damon_is_running(ctx))
- return -EINVAL;
+ return damon_call_handle_inactive_ctx(ctx, control);
if (control->repeat)
return 0;
wait_for_completion(&control->completion);
@@ -2755,13 +2784,13 @@ done:
if (ctx->ops.cleanup)
ctx->ops.cleanup(ctx);
kfree(ctx->regions_score_histogram);
+ kdamond_call(ctx, true);
pr_debug("kdamond (%d) finishes\n", current->pid);
mutex_lock(&ctx->kdamond_lock);
ctx->kdamond = NULL;
mutex_unlock(&ctx->kdamond_lock);
- kdamond_call(ctx, true);
damos_walk_cancel(ctx);
mutex_lock(&damon_lock);
_
Patches currently in -mm which might be from sj(a)kernel.org are
mm-damon-core-introduce-nr_snapshots-damos-stat.patch
mm-damon-sysfs-schemes-introduce-nr_snapshots-damos-stat-file.patch
docs-mm-damon-design-update-for-nr_snapshots-damos-stat.patch
docs-admin-guide-mm-damon-usage-update-for-nr_snapshots-damos-stat.patch
docs-abi-damon-update-for-nr_snapshots-damos-stat.patch
mm-damon-update-damos-kerneldoc-for-stat-field.patch
mm-damon-core-implement-max_nr_snapshots.patch
mm-damon-sysfs-schemes-implement-max_nr_snapshots-file.patch
docs-mm-damon-design-update-for-max_nr_snapshots.patch
docs-admin-guide-mm-damon-usage-update-for-max_nr_snapshots.patch
docs-abi-damon-update-for-max_nr_snapshots.patch
mm-damon-core-add-trace-point-for-damos-stat-per-apply-interval.patch
mm-damon-tests-core-kunit-add-test-cases-for-multiple-regions-in-damon_test_split_regions_of-fix.patch
The quilt patch titled
Subject: mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free
has been removed from the -mm tree. Its filename was
mm-page_alloc-make-percpu_pagelist_high_fraction-reads-lock-free.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Subject: mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free
Date: Mon, 1 Dec 2025 11:30:09 +0530
When page isolation loops indefinitely during memory offline, reading
/proc/sys/vm/percpu_pagelist_high_fraction blocks on pcp_batch_high_lock,
causing hung task warnings.
Make procfs reads lock-free since percpu_pagelist_high_fraction is a
simple integer with naturally atomic reads, writers still serialize via
the mutex.
This prevents hung task warnings when reading the procfs file during
long-running memory offline operations.
[akpm(a)linux-foundation.org: add comment, per Michal]
Link: https://lkml.kernel.org/r/aS_y9AuJQFydLEXo@tiehlicka
Link: https://lkml.kernel.org/r/20251201060009.1420792-1-aboorvad@linux.ibm.com
Signed-off-by: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Brendan Jackman <jackmanb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-make-percpu_pagelist_high_fraction-reads-lock-free
+++ a/mm/page_alloc.c
@@ -6667,11 +6667,19 @@ static int percpu_pagelist_high_fraction
int old_percpu_pagelist_high_fraction;
int ret;
+ /*
+ * Avoid using pcp_batch_high_lock for reads as the value is read
+ * atomically and a race with offlining is harmless.
+ */
+
+ if (!write)
+ return proc_dointvec_minmax(table, write, buffer, length, ppos);
+
mutex_lock(&pcp_batch_high_lock);
old_percpu_pagelist_high_fraction = percpu_pagelist_high_fraction;
ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
- if (!write || ret < 0)
+ if (ret < 0)
goto out;
/* Sanity checking to avoid pcp imbalance */
_
Patches currently in -mm which might be from aboorvad(a)linux.ibm.com are
The quilt patch titled
Subject: lib/buildid: use __kernel_read() for sleepable context
has been removed from the -mm tree. Its filename was
lib-buildid-use-__kernel_read-for-sleepable-context.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Shakeel Butt <shakeel.butt(a)linux.dev>
Subject: lib/buildid: use __kernel_read() for sleepable context
Date: Mon, 22 Dec 2025 12:58:59 -0800
Prevent a "BUG: unable to handle kernel NULL pointer dereference in
filemap_read_folio".
For the sleepable context, convert freader to use __kernel_read() instead
of direct page cache access via read_cache_folio(). This simplifies the
faultable code path by using the standard kernel file reading interface
which handles all the complexity of reading file data.
At the moment we are not changing the code for non-sleepable context which
uses filemap_get_folio() and only succeeds if the target folios are
already in memory and up-to-date. The reason is to keep the patch simple
and easier to backport to stable kernels.
Syzbot repro does not crash the kernel anymore and the selftests run
successfully.
In the follow up we will make __kernel_read() with IOCB_NOWAIT work for
non-sleepable contexts. In addition, I would like to replace the
secretmem check with a more generic approach and will add fstest for the
buildid code.
Link: https://lkml.kernel.org/r/20251222205859.3968077-1-shakeel.butt@linux.dev
Fixes: ad41251c290d ("lib/buildid: implement sleepable build_id_parse() API")
Reported-by: syzbot+09b7d050e4806540153d(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=09b7d050e4806540153d
Signed-off-by: Shakeel Butt <shakeel.butt(a)linux.dev>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Tested-by: Jinchao Wang <wangjinchao600(a)gmail.com>
Link: https://lkml.kernel.org/r/aUteBPWPYzVWIZFH@ndev
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Daniel Borkman <daniel(a)iogearbox.net>
Cc: "Darrick J. Wong" <djwong(a)kernel.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/buildid.c | 32 ++++++++++++++++++++------------
1 file changed, 20 insertions(+), 12 deletions(-)
--- a/lib/buildid.c~lib-buildid-use-__kernel_read-for-sleepable-context
+++ a/lib/buildid.c
@@ -5,6 +5,7 @@
#include <linux/elf.h>
#include <linux/kernel.h>
#include <linux/pagemap.h>
+#include <linux/fs.h>
#include <linux/secretmem.h>
#define BUILD_ID 3
@@ -46,20 +47,9 @@ static int freader_get_folio(struct frea
freader_put_folio(r);
- /* reject secretmem folios created with memfd_secret() */
- if (secretmem_mapping(r->file->f_mapping))
- return -EFAULT;
-
+ /* only use page cache lookup - fail if not already cached */
r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
- /* if sleeping is allowed, wait for the page, if necessary */
- if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio))) {
- filemap_invalidate_lock_shared(r->file->f_mapping);
- r->folio = read_cache_folio(r->file->f_mapping, file_off >> PAGE_SHIFT,
- NULL, r->file);
- filemap_invalidate_unlock_shared(r->file->f_mapping);
- }
-
if (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)) {
if (!IS_ERR(r->folio))
folio_put(r->folio);
@@ -97,6 +87,24 @@ const void *freader_fetch(struct freader
return r->data + file_off;
}
+ /* reject secretmem folios created with memfd_secret() */
+ if (secretmem_mapping(r->file->f_mapping)) {
+ r->err = -EFAULT;
+ return NULL;
+ }
+
+ /* use __kernel_read() for sleepable context */
+ if (r->may_fault) {
+ ssize_t ret;
+
+ ret = __kernel_read(r->file, r->buf, sz, &file_off);
+ if (ret != sz) {
+ r->err = (ret < 0) ? ret : -EIO;
+ return NULL;
+ }
+ return r->buf;
+ }
+
/* fetch or reuse folio for given file offset */
r->err = freader_get_folio(r, file_off);
if (r->err)
_
Patches currently in -mm which might be from shakeel.butt(a)linux.dev are
memcg-introduce-private-id-api-for-in-kernel-users.patch
memcg-expose-mem_cgroup_ino-and-mem_cgroup_get_from_ino-unconditionally.patch
memcg-mem_cgroup_get_from_ino-returns-null-on-error.patch
memcg-use-cgroup_id-instead-of-cgroup_ino-for-memcg-id.patch
mm-damon-use-cgroup-id-instead-of-private-memcg-id.patch
mm-vmscan-use-cgroup-id-instead-of-private-memcg-id-in-lru_gen-interface.patch
memcg-remove-unused-mem_cgroup_id-and-mem_cgroup_from_id.patch
memcg-rename-mem_cgroup_ino-to-mem_cgroup_id.patch
memcg-rename-mem_cgroup_ino-to-mem_cgroup_id-fix.patch
With function virtio_crypto_skcipher_crypt_req(), there is already
virtqueue_kick() call with spinlock held in function
__virtio_crypto_skcipher_do_req(). Remove duplicated virtqueue_kick()
function call here.
Fixes: d79b5d0bbf2e ("crypto: virtio - support crypto engine framework")
Cc: stable(a)vger.kernel.org
Signed-off-by: Bibo Mao <maobibo(a)loongson.cn>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Acked-by: Michael S. Tsirkin <mst(a)redhat.com>
---
drivers/crypto/virtio/virtio_crypto_skcipher_algs.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c b/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
index 1b3fb21a2a7d..11053d1786d4 100644
--- a/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
@@ -541,8 +541,6 @@ int virtio_crypto_skcipher_crypt_req(
if (ret < 0)
return ret;
- virtqueue_kick(data_vq->vq);
-
return 0;
}
--
2.39.3
When VM boots with one virtio-crypto PCI device and builtin backend,
run openssl benchmark command with multiple processes, such as
openssl speed -evp aes-128-cbc -engine afalg -seconds 10 -multi 32
openssl processes will hangup and there is error reported like this:
virtio_crypto virtio0: dataq.0:id 3 is not a head!
It seems that the data virtqueue need protection when it is handled
for virtio done notification. If the spinlock protection is added
in virtcrypto_done_task(), openssl benchmark with multiple processes
works well.
Fixes: fed93fb62e05 ("crypto: virtio - Handle dataq logic with tasklet")
Cc: stable(a)vger.kernel.org
Signed-off-by: Bibo Mao <maobibo(a)loongson.cn>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Acked-by: Michael S. Tsirkin <mst(a)redhat.com>
---
drivers/crypto/virtio/virtio_crypto_core.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
index 3d241446099c..ccc6b5c1b24b 100644
--- a/drivers/crypto/virtio/virtio_crypto_core.c
+++ b/drivers/crypto/virtio/virtio_crypto_core.c
@@ -75,15 +75,20 @@ static void virtcrypto_done_task(unsigned long data)
struct data_queue *data_vq = (struct data_queue *)data;
struct virtqueue *vq = data_vq->vq;
struct virtio_crypto_request *vc_req;
+ unsigned long flags;
unsigned int len;
+ spin_lock_irqsave(&data_vq->lock, flags);
do {
virtqueue_disable_cb(vq);
while ((vc_req = virtqueue_get_buf(vq, &len)) != NULL) {
+ spin_unlock_irqrestore(&data_vq->lock, flags);
if (vc_req->alg_cb)
vc_req->alg_cb(vc_req, len);
+ spin_lock_irqsave(&data_vq->lock, flags);
}
} while (!virtqueue_enable_cb(vq));
+ spin_unlock_irqrestore(&data_vq->lock, flags);
}
static void virtcrypto_dataq_callback(struct virtqueue *vq)
--
2.39.3
When ublk_ctrl_start_dev() fails after waiting for completion, the
device needs to be properly cancelled to prevent leaving it in an
inconsistent state. Without this, pending I/O commands may remain
uncompleted and the device cannot be cleanly removed.
Add ublk_cancel_dev() call in the error path to ensure proper cleanup
when START_DEV fails.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
drivers/block/ublk_drv.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index f6e5a0766721..2d6250d61a7b 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -2953,8 +2953,10 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub,
if (wait_for_completion_interruptible(&ub->completion) != 0)
return -EINTR;
- if (ub->ublksrv_tgid != ublksrv_pid)
- return -EINVAL;
+ if (ub->ublksrv_tgid != ublksrv_pid) {
+ ret = -EINVAL;
+ goto out;
+ }
mutex_lock(&ub->mutex);
if (ub->dev_info.state == UBLK_S_DEV_LIVE ||
@@ -3017,6 +3019,9 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub,
put_disk(disk);
out_unlock:
mutex_unlock(&ub->mutex);
+out:
+ if (ret)
+ ublk_cancel_dev(ub);
return ret;
}
--
2.47.0
Hi Greg and Sasha,
Commit 70740454377f ("drm/amd/display: Apply e4479aecf658 to dml") was
recently queued up for 6.18. This same change will be needed in 6.12 but
it will need commit 820ccf8cb2b1 ("drm/amd/display: Respect user's
CONFIG_FRAME_WARN more for dml files") applied first for it to apply
cleanly. Please let me know if there are any problems with this.
Cheers,
Nathan
Review given to v2 [1] of commit fc259b024cb3 ("dt-bindings: usb: Add
binding for PS5511 hub controller") asked to use unevaluatedProperties,
but this was ignored by the author probably because current dtschema
does not allow to use both additionalProperties and
unevaluatedProperties. As an effect, this binding does not end with
unevaluatedProperties and allows any properties to be added.
Fix this by reverting the approach suggested at v2 review and using
simpler definition of "reg" constraints.
Link: https://lore.kernel.org/r/20250416180023.GB3327258-robh@kernel.org/ [1]
Fixes: fc259b024cb3 ("dt-bindings: usb: Add binding for PS5511 hub controller")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)oss.qualcomm.com>
---
.../devicetree/bindings/usb/parade,ps5511.yaml | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/Documentation/devicetree/bindings/usb/parade,ps5511.yaml b/Documentation/devicetree/bindings/usb/parade,ps5511.yaml
index 10d002f09db8..154d779e507a 100644
--- a/Documentation/devicetree/bindings/usb/parade,ps5511.yaml
+++ b/Documentation/devicetree/bindings/usb/parade,ps5511.yaml
@@ -15,6 +15,10 @@ properties:
- usb1da0,5511
- usb1da0,55a1
+ reg:
+ minimum: 1
+ maximum: 5
+
reset-gpios:
items:
- description: GPIO specifier for RESETB pin.
@@ -41,12 +45,6 @@ properties:
minimum: 1
maximum: 5
-additionalProperties:
- properties:
- reg:
- minimum: 1
- maximum: 5
-
required:
- peer-hub
@@ -67,6 +65,8 @@ allOf:
patternProperties:
'^.*@5$': false
+unevaluatedProperties: false
+
examples:
- |
usb {
--
2.51.0
Currently pivot_root() doesn't work on the real rootfs because it
cannot be unmounted. Userspace has to do a recursive removal of the
initramfs contents manually before continuing the boot.
Really all we want from the real rootfs is to serve as the parent mount
for anything that is actually useful such as the tmpfs or ramfs for
initramfs unpacking or the rootfs itself. There's no need for the real
rootfs to actually be anything meaningful or useful. Add a immutable
rootfs called "nullfs" that can be selected via the "nullfs_rootfs"
kernel command line option.
The kernel will mount a tmpfs/ramfs on top of it, unpack the initramfs
and fire up userspace which mounts the rootfs and can then just do:
chdir(rootfs);
pivot_root(".", ".");
umount2(".", MNT_DETACH);
and be done with it. (Ofc, userspace can also choose to retain the
initramfs contents by using something like pivot_root(".", "/initramfs")
without unmounting it.)
Technically this also means that the rootfs mount in unprivileged
namespaces doesn't need to become MNT_LOCKED anymore as it's guaranteed
that the immutable rootfs remains permanently empty so there cannot be
anything revealed by unmounting the covering mount.
In the future this will also allow us to create completely empty mount
namespaces without risking to leak anything.
systemd already handles this all correctly as it tries to pivot_root()
first and falls back to MS_MOVE only when that fails.
This goes back to various discussion in previous years and a LPC 2024
presentation about this very topic.
Now in vfs-7.0.nullfs.
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
---
Changes in v2:
- Rename to "nullfs".
- Update documentation.
- Link to v1: https://patch.msgid.link/20260102-work-immutable-rootfs-v1-0-f2073b2d1602@k…
---
Christian Brauner (4):
fs: ensure that internal tmpfs mount gets mount id zero
fs: add init_pivot_root()
fs: add immutable rootfs
docs: mention nullfs
.../filesystems/ramfs-rootfs-initramfs.rst | 32 +++-
fs/Makefile | 2 +-
fs/init.c | 17 ++
fs/internal.h | 1 +
fs/mount.h | 1 +
fs/namespace.c | 181 ++++++++++++++-------
fs/nullfs.c | 70 ++++++++
include/linux/init_syscalls.h | 1 +
include/uapi/linux/magic.h | 1 +
init/do_mounts.c | 14 ++
init/do_mounts.h | 1 +
11 files changed, 254 insertions(+), 67 deletions(-)
---
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
change-id: 20260102-work-immutable-rootfs-b5f23e0f5a27
The patch titled
Subject: mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node
has been added to the -mm mm-new branch. Its filename is
mm-vmalloc-prevent-rcu-stalls-in-kasan_release_vmalloc_node.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
The mm-new branch of mm.git is not included in linux-next
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Deepanshu Kartikey <kartikey406(a)gmail.com>
Subject: mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node
Date: Mon, 12 Jan 2026 16:06:12 +0530
When CONFIG_PAGE_OWNER is enabled, freeing KASAN shadow pages during
vmalloc cleanup triggers expensive stack unwinding that acquires RCU read
locks. Processing a large purge_list without rescheduling can cause the
task to hold CPU for extended periods (10+ seconds), leading to RCU stalls
and potential OOM conditions.
The issue manifests in purge_vmap_node() -> kasan_release_vmalloc_node()
where iterating through hundreds or thousands of vmap_area entries and
freeing their associated shadow pages causes:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P6229/1:b..l
...
task:kworker/0:17 state:R running task stack:28840 pid:6229
...
kasan_release_vmalloc_node+0x1ba/0xad0 mm/vmalloc.c:2299
purge_vmap_node+0x1ba/0xad0 mm/vmalloc.c:2299
Each call to kasan_release_vmalloc() can free many pages, and with
page_owner tracking, each free triggers save_stack() which performs stack
unwinding under RCU read lock. Without yielding, this creates an
unbounded RCU critical section.
Add periodic cond_resched() calls within the loop to allow:
- RCU grace periods to complete
- Other tasks to run
- Scheduler to preempt when needed
The fix uses need_resched() for immediate response under load, with a
batch count of 32 as a guaranteed upper bound to prevent worst-case stalls
even under light load.
Link: https://lkml.kernel.org/r/20260112103612.627247-1-kartikey406@gmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406(a)gmail.com>
Reported-by: syzbot+d8d4c31d40f868eaea30(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8d4c31d40f868eaea30
Link: https://lore.kernel.org/all/20260112084723.622910-1-kartikey406@gmail.com/T/ [v1]
Suggested-by: Uladzislau Rezki <urezki(a)gmail.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmalloc.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/vmalloc.c~mm-vmalloc-prevent-rcu-stalls-in-kasan_release_vmalloc_node
+++ a/mm/vmalloc.c
@@ -2273,11 +2273,14 @@ decay_va_pool_node(struct vmap_node *vn,
reclaim_list_global(&decay_list);
}
+#define KASAN_RELEASE_BATCH_SIZE 32
+
static void
kasan_release_vmalloc_node(struct vmap_node *vn)
{
struct vmap_area *va;
unsigned long start, end;
+ unsigned int batch_count = 0;
start = list_first_entry(&vn->purge_list, struct vmap_area, list)->va_start;
end = list_last_entry(&vn->purge_list, struct vmap_area, list)->va_end;
@@ -2287,6 +2290,11 @@ kasan_release_vmalloc_node(struct vmap_n
kasan_release_vmalloc(va->va_start, va->va_end,
va->va_start, va->va_end,
KASAN_VMALLOC_PAGE_RANGE);
+
+ if (need_resched() || (++batch_count >= KASAN_RELEASE_BATCH_SIZE)) {
+ cond_resched();
+ batch_count = 0;
+ }
}
kasan_release_vmalloc(start, end, start, end, KASAN_VMALLOC_TLB_FLUSH);
_
Patches currently in -mm which might be from kartikey406(a)gmail.com are
mm-swap_cgroup-fix-kernel-bug-in-swap_cgroup_record.patch
mm-vmalloc-prevent-rcu-stalls-in-kasan_release_vmalloc_node.patch
ocfs2-validate-i_refcount_loc-when-refcount-flag-is-set.patch
ocfs2-validate-inline-data-i_size-during-inode-read.patch
ocfs2-add-check-for-free-bits-before-allocation-in-ocfs2_move_extent.patch