The patch titled
Subject: mm/page_alloc: prevent pcp corruption with SMP=n
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-page_alloc-prevent-pcp-corruption-with-smp=n.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm/page_alloc: prevent pcp corruption with SMP=n
Date: Mon, 05 Jan 2026 16:08:56 +0100
The kernel test robot has reported:
BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28
lock: 0xffff888807e35ef0, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0
CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-rc5-00127-ga06157804399 #1 PREEMPT 8cc09ef94dcec767faa911515ce9e609c45db470
Call Trace:
<IRQ>
__dump_stack (lib/dump_stack.c:95)
dump_stack_lvl (lib/dump_stack.c:123)
dump_stack (lib/dump_stack.c:130)
spin_dump (kernel/locking/spinlock_debug.c:71)
do_raw_spin_trylock (kernel/locking/spinlock_debug.c:?)
_raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
__free_frozen_pages (mm/page_alloc.c:2973)
___free_pages (mm/page_alloc.c:5295)
__free_pages (mm/page_alloc.c:5334)
tlb_remove_table_rcu (include/linux/mm.h:? include/linux/mm.h:3122 include/asm-generic/tlb.h:220 mm/mmu_gather.c:227 mm/mmu_gather.c:290)
? __cfi_tlb_remove_table_rcu (mm/mmu_gather.c:289)
? rcu_core (kernel/rcu/tree.c:?)
rcu_core (include/linux/rcupdate.h:341 kernel/rcu/tree.c:2607 kernel/rcu/tree.c:2861)
rcu_core_si (kernel/rcu/tree.c:2879)
handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623)
__irq_exit_rcu (arch/x86/include/asm/jump_label.h:36 kernel/softirq.c:725)
irq_exit_rcu (kernel/softirq.c:741)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
</IRQ>
<TASK>
RIP: 0010:_raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
free_pcppages_bulk (mm/page_alloc.c:1494)
drain_pages_zone (include/linux/spinlock.h:391 mm/page_alloc.c:2632)
__drain_all_pages (mm/page_alloc.c:2731)
drain_all_pages (mm/page_alloc.c:2747)
kcompactd (mm/compaction.c:3115)
kthread (kernel/kthread.c:465)
? __cfi_kcompactd (mm/compaction.c:3166)
? __cfi_kthread (kernel/kthread.c:412)
ret_from_fork (arch/x86/kernel/process.c:164)
? __cfi_kthread (kernel/kthread.c:412)
ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
</TASK>
Matthew has analyzed the report and identified that in drain_pages_zone()
we are in a section protected by spin_lock(&pcp->lock) when we get an
interrupt that attempts spin_trylock() on the same lock. The code is
designed to work this way without disabling IRQs, occasionally failing the
trylock and taking a fallback path. However, the SMP=n spinlock
implementation assumes spin_trylock() can never fail, and thus makes it a
no-op that always succeeds. Here the enabled lock debugging catches the
problem; without it, the result could be corruption of the pcp structure.
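For reference, the reason the trylock can never fail on UP: with SMP=n (and
PREEMPT_RT=n) the spinlock API compiles down to preemption control only. A
simplified sketch of the relevant definitions (modelled on
include/linux/spinlock_api_up.h, debug variants omitted):

#define __LOCK(lock) \
	do { preempt_disable(); __acquire(lock); (void)(lock); } while (0)

#define _raw_spin_lock(lock)	__LOCK(lock)
/* trylock unconditionally "succeeds"; nested contention is assumed impossible: */
#define _raw_spin_trylock(lock)	({ __LOCK(lock); 1; })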
The problem was introduced by commit 574907741599 ("mm/page_alloc:
leave IRQs enabled for per-cpu page allocations"). The pcp locking scheme
recognizes the need to disable IRQs to prevent nesting spin_trylock()
sections on SMP=n, but the need to prevent such nesting inside spin_lock()
sections was not recognized. Fix it by introducing local wrappers that
turn spin_lock() into spin_lock_irqsave() with SMP=n, and use them in all
places that do spin_lock(&pcp->lock).
Link: https://lkml.kernel.org/r/20260105-fix-pcp-up-v1-1-5579662d2071@suse.cz
Fixes: 574907741599 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512101320.e2f2dd6f-lkp@intel.com
Analyzed-by: Matthew Wilcox <willy(a)infradead.org>
Link: https://lore.kernel.org/all/aUW05pyc9nZkvY-1@casper.infradead.org/
Cc: Brendan Jackman <jackmanb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 45 +++++++++++++++++++++++++++++++++++++--------
1 file changed, 37 insertions(+), 8 deletions(-)
--- a/mm/page_alloc.c~mm-page_alloc-prevent-pcp-corruption-with-smp=n
+++ a/mm/page_alloc.c
@@ -167,6 +167,31 @@ static inline void __pcp_trylock_noop(un
pcp_trylock_finish(UP_flags); \
})
+/*
+ * With the UP spinlock implementation, when we spin_lock(&pcp->lock) (e.g.
+ * for a potentially remote cpu drain) and get interrupted by an operation
+ * that attempts pcp_spin_trylock(), we can't rely on the trylock failing,
+ * because the UP spinlock assumptions make the trylock a no-op. So we have to
+ * turn that spin_lock() into a spin_lock_irqsave(). This works because on UP
+ * there are no remote cpus, so we can only be locking the only existing local one.
+ */
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
+static inline void __flags_noop(unsigned long *flags) { }
+#define spin_lock_maybe_irqsave(lock, flags) \
+({ \
+ __flags_noop(&(flags)); \
+ spin_lock(lock); \
+})
+#define spin_unlock_maybe_irqrestore(lock, flags) \
+({ \
+ spin_unlock(lock); \
+ __flags_noop(&(flags)); \
+})
+#else
+#define spin_lock_maybe_irqsave(lock, flags) spin_lock_irqsave(lock, flags)
+#define spin_unlock_maybe_irqrestore(lock, flags) spin_unlock_irqrestore(lock, flags)
+#endif
+
#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
DEFINE_PER_CPU(int, numa_node);
EXPORT_PER_CPU_SYMBOL(numa_node);
@@ -2556,6 +2581,7 @@ static int rmqueue_bulk(struct zone *zon
bool decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
{
int high_min, to_drain, to_drain_batched, batch;
+ unsigned long UP_flags;
bool todo = false;
high_min = READ_ONCE(pcp->high_min);
@@ -2575,9 +2601,9 @@ bool decay_pcp_high(struct zone *zone, s
to_drain = pcp->count - pcp->high;
while (to_drain > 0) {
to_drain_batched = min(to_drain, batch);
- spin_lock(&pcp->lock);
+ spin_lock_maybe_irqsave(&pcp->lock, UP_flags);
free_pcppages_bulk(zone, to_drain_batched, pcp, 0);
- spin_unlock(&pcp->lock);
+ spin_unlock_maybe_irqrestore(&pcp->lock, UP_flags);
todo = true;
to_drain -= to_drain_batched;
@@ -2594,14 +2620,15 @@ bool decay_pcp_high(struct zone *zone, s
*/
void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
{
+ unsigned long UP_flags;
int to_drain, batch;
batch = READ_ONCE(pcp->batch);
to_drain = min(pcp->count, batch);
if (to_drain > 0) {
- spin_lock(&pcp->lock);
+ spin_lock_maybe_irqsave(&pcp->lock, UP_flags);
free_pcppages_bulk(zone, to_drain, pcp, 0);
- spin_unlock(&pcp->lock);
+ spin_unlock_maybe_irqrestore(&pcp->lock, UP_flags);
}
}
#endif
@@ -2612,10 +2639,11 @@ void drain_zone_pages(struct zone *zone,
static void drain_pages_zone(unsigned int cpu, struct zone *zone)
{
struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+ unsigned long UP_flags;
int count;
do {
- spin_lock(&pcp->lock);
+ spin_lock_maybe_irqsave(&pcp->lock, UP_flags);
count = pcp->count;
if (count) {
int to_drain = min(count,
@@ -2624,7 +2652,7 @@ static void drain_pages_zone(unsigned in
free_pcppages_bulk(zone, to_drain, pcp, 0);
count -= to_drain;
}
- spin_unlock(&pcp->lock);
+ spin_unlock_maybe_irqrestore(&pcp->lock, UP_flags);
} while (count);
}
@@ -6109,6 +6137,7 @@ static void zone_pcp_update_cacheinfo(st
{
struct per_cpu_pages *pcp;
struct cpu_cacheinfo *cci;
+ unsigned long UP_flags;
pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
cci = get_cpu_cacheinfo(cpu);
@@ -6119,12 +6148,12 @@ static void zone_pcp_update_cacheinfo(st
* This can reduce zone lock contention without hurting
* cache-hot pages sharing.
*/
- spin_lock(&pcp->lock);
+ spin_lock_maybe_irqsave(&pcp->lock, UP_flags);
if ((cci->per_cpu_data_slice_size >> PAGE_SHIFT) > 3 * pcp->batch)
pcp->flags |= PCPF_FREE_HIGH_BATCH;
else
pcp->flags &= ~PCPF_FREE_HIGH_BATCH;
- spin_unlock(&pcp->lock);
+ spin_unlock_maybe_irqrestore(&pcp->lock, UP_flags);
}
void setup_pcp_cacheinfo(unsigned int cpu)
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
mm-page_alloc-prevent-pcp-corruption-with-smp=n.patch
mm-page_alloc-thp-prevent-reclaim-for-__gfp_thisnode-thp-allocations.patch
__arm_lpae_unmap() returns size_t but was returning -ENOENT (a negative
error code) when encountering an unmapped PTE. Since size_t is unsigned,
-ENOENT (typically -2) becomes a huge positive value (0xFFFFFFFFFFFFFFFE
on 64-bit systems).
This corrupted value propagates through the call chain:
__arm_lpae_unmap() returns -ENOENT as size_t
-> arm_lpae_unmap_pages() returns it
-> __iommu_unmap() adds it to iova address
-> iommu_pgsize() triggers BUG_ON due to corrupted iova
This can cause IOVA address overflow in __iommu_unmap() loop and
trigger BUG_ON in iommu_pgsize() from invalid address alignment.
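For illustration, a standalone sketch of the underflow (plain userspace C,
not kernel code):

#include <errno.h>
#include <stdio.h>

int main(void)
{
	size_t unmapped = (size_t)-ENOENT;	/* what the old code returned */
	unsigned long iova = 0x100000;		/* illustrative address */

	iova += unmapped;	/* __iommu_unmap()-style advance by "unmapped" */
	printf("unmapped = %#zx, iova now %#lx\n", unmapped, iova);
	return 0;
}

On a 64-bit system this prints unmapped = 0xfffffffffffffffe and leaves iova
at a wrapped, misaligned 0xffffe, which is how iommu_pgsize() ends up
hitting its BUG_ON.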
Fix this by returning 0 instead of -ENOENT. The WARN_ON already signals
the error condition, and returning 0 (meaning "nothing unmapped")
is the correct semantic for a size_t return type. This matches the
behavior of other io-pgtable implementations (io-pgtable-arm-v7s,
io-pgtable-dart), which return 0 on error conditions.
Fixes: 3318f7b5cefb ("iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux(a)gmail.com>
---
drivers/iommu/io-pgtable-arm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index e6626004b323..05d63fe92e43 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -637,7 +637,7 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
pte = READ_ONCE(*ptep);
if (!pte) {
WARN_ON(!(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NO_WARN));
- return -ENOENT;
+ return 0;
}
/* If the size matches this level, we're in the right place */
--
2.40.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0ef841113724166c3c484d0e9ae6db1eb5634fde
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2026010535-unmindful-hedge-1198@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ef841113724166c3c484d0e9ae6db1eb5634fde Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 17 Oct 2025 07:33:20 +0200
Subject: [PATCH] media: vpif_capture: fix section mismatch
Platform drivers can be probed after their init sections have been
discarded (e.g. on probe deferral or manual rebind through sysfs) so the
probe function must not live in init.
Note that commit ffa1b391c61b ("V4L/DVB: vpif_cap/disp: Removed section
mismatch warning") incorrectly suppressed the modpost warning.
Fixes: ffa1b391c61b ("V4L/DVB: vpif_cap/disp: Removed section mismatch warning")
Fixes: 6ffefff5a9e7 ("V4L/DVB (12906c): V4L : vpif capture driver for DM6467")
Cc: stable(a)vger.kernel.org # 2.6.32
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco(a)kernel.org>
diff --git a/drivers/media/platform/ti/davinci/vpif_capture.c b/drivers/media/platform/ti/davinci/vpif_capture.c
index d053972888d1..243c6196b024 100644
--- a/drivers/media/platform/ti/davinci/vpif_capture.c
+++ b/drivers/media/platform/ti/davinci/vpif_capture.c
@@ -1600,7 +1600,7 @@ vpif_capture_get_pdata(struct platform_device *pdev,
* This creates device entries by register itself to the V4L2 driver and
* initializes fields of each channel objects
*/
-static __init int vpif_probe(struct platform_device *pdev)
+static int vpif_probe(struct platform_device *pdev)
{
struct vpif_subdev_info *subdevdata;
struct i2c_adapter *i2c_adap;
@@ -1807,7 +1807,7 @@ static int vpif_resume(struct device *dev)
static SIMPLE_DEV_PM_OPS(vpif_pm_ops, vpif_suspend, vpif_resume);
-static __refdata struct platform_driver vpif_driver = {
+static struct platform_driver vpif_driver = {
.driver = {
.name = VPIF_DRIVER_NAME,
.pm = &vpif_pm_ops,
The patch titled
Subject: mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too
Date: Mon, 5 Jan 2026 20:11:49 +0000
The is_mergeable_anon_vma() function uses vmg->middle as the source VMA.
However when merging a new VMA, this field is NULL.
In all cases except mremap(), the new VMA will either be newly established,
and thus lack an anon_vma, or will be an expansion of an existing VMA, in
which case we do not care whether the VMA is CoW'd or not.
In the case of an mremap(), we can end up in a situation where we can
accidentally allow an unfaulted/faulted merge with a VMA that has been
forked, violating the general rule that we do not permit this for reasons
of anon_vma lock scalability.
Now that we are aware of the fact that we are copying a VMA, and also know
which VMA that is, we can explicitly check for this, so do so.
This has been pertinent since commit 879bca0a2c4f ("mm/vma: fix incorrectly
disallowed anonymous VMA merges"), as that commit permits previously
disallowed unfaulted/faulted merges which can run afoul of this issue.
While we are here, vma_had_uncowed_parents() is a confusing name, so make
it simple and rename it to vma_is_fork_child().
Link: https://lkml.kernel.org/r/6e2b9b3024ae1220961c8b81d74296d4720eaf2b.17676382…
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jeongjun Park <aha310510(a)gmail.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
--- a/mm/vma.c~mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too
+++ a/mm/vma.c
@@ -67,18 +67,13 @@ struct mmap_state {
.state = VMA_MERGE_START, \
}
-/*
- * If, at any point, the VMA had unCoW'd mappings from parents, it will maintain
- * more than one anon_vma_chain connecting it to more than one anon_vma. A merge
- * would mean a wider range of folios sharing the root anon_vma lock, and thus
- * potential lock contention, we do not wish to encourage merging such that this
- * scales to a problem.
- */
-static bool vma_had_uncowed_parents(struct vm_area_struct *vma)
+/* Was this VMA ever forked from a parent, i.e. maybe contains CoW mappings? */
+static bool vma_is_fork_child(struct vm_area_struct *vma)
{
/*
* The list_is_singular() test is to avoid merging VMA cloned from
- * parents. This can improve scalability caused by anon_vma lock.
+ * parents. This can improve scalability caused by the anon_vma root
+ * lock.
*/
return vma && vma->anon_vma && !list_is_singular(&vma->anon_vma_chain);
}
@@ -115,11 +110,19 @@ static bool is_mergeable_anon_vma(struct
VM_WARN_ON(src && src_anon != src->anon_vma);
/* Case 1 - we will dup_anon_vma() from src into tgt. */
- if (!tgt_anon && src_anon)
- return !vma_had_uncowed_parents(src);
+ if (!tgt_anon && src_anon) {
+ struct vm_area_struct *copied_from = vmg->copied_from;
+
+ if (vma_is_fork_child(src))
+ return false;
+ if (vma_is_fork_child(copied_from))
+ return false;
+
+ return true;
+ }
/* Case 2 - we will simply use tgt's anon_vma. */
if (tgt_anon && !src_anon)
- return !vma_had_uncowed_parents(tgt);
+ return !vma_is_fork_child(tgt);
/* Case 3 - the anon_vma's are already shared. */
return src_anon == tgt_anon;
}
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
tools-testing-selftests-add-tests-for-tgt-src-mremap-merges.patch
mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too.patch
tools-testing-selftests-add-forked-un-faulted-vma-merge-tests.patch
The patch titled
Subject: tools/testing/selftests: add tests for !tgt, src mremap() merges
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
tools-testing-selftests-add-tests-for-tgt-src-mremap-merges.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: tools/testing/selftests: add tests for !tgt, src mremap() merges
Date: Mon, 5 Jan 2026 20:11:48 +0000
Test that an mremap() of a VMA into a position where the merge target VMA
is unfaulted and the source is faulted is performed correctly.
We cover 4 cases:
1. Previous VMA unfaulted:
copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev
target = prev, expand prev to cover.
2. Next VMA unfaulted:
copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next
target = next, expand next to cover.
3. Both adjacent VMAs unfaulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover.
4. prev unfaulted, next faulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover. Essentially equivalent to 3, but
with additional requirement that next's anon_vma is the same as the
copied VMA's.
Each of these is performed with MREMAP_DONTUNMAP set, which will trigger a
KASAN use-after-free report, or an assert on a zero-refcount anon_vma, if a
bug exists in correctly propagating anon_vma state in any scenario.
Link: https://lkml.kernel.org/r/f903af2930c7c2c6e0948c886b58d0f42d8e8ba3.17676382…
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jeongjun Park <aha310510(a)gmail.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/merge.c | 232 +++++++++++++++++++++++++++
1 file changed, 232 insertions(+)
--- a/tools/testing/selftests/mm/merge.c~tools-testing-selftests-add-tests-for-tgt-src-mremap-merges
+++ a/tools/testing/selftests/mm/merge.c
@@ -1171,4 +1171,236 @@ TEST_F(merge, mremap_correct_placed_faul
ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr + 15 * page_size);
}
+TEST_F(merge, mremap_faulted_to_unfaulted_prev)
+{
+ struct procmap_fd *procmap = &self->procmap;
+ unsigned int page_size = self->page_size;
+ char *ptr_a, *ptr_b;
+
+ /*
+ * mremap() such that A and B merge:
+ *
+ * |------------|
+ * | \ |
+ * |-----------| | / |---------|
+ * | unfaulted | v \ | faulted |
+ * |-----------| / |---------|
+ * B \ A
+ */
+
+ /* Map VMA A into place. */
+ ptr_a = mmap(&self->carveout[page_size + 3 * page_size],
+ 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+ /* Fault it in. */
+ ptr_a[0] = 'x';
+
+ /*
+ * Now move it out of the way so we can place VMA B in position,
+ * unfaulted.
+ */
+ ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /* Map VMA B into place. */
+ ptr_b = mmap(&self->carveout[page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /*
+ * Now move VMA A into position with MREMAP_DONTUNMAP to catch incorrect
+ * anon_vma propagation.
+ */
+ ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+ &self->carveout[page_size + 3 * page_size]);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /* The VMAs should have merged. */
+ ASSERT_TRUE(find_vma_procmap(procmap, ptr_b));
+ ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_b);
+ ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_b + 6 * page_size);
+}
+
+TEST_F(merge, mremap_faulted_to_unfaulted_next)
+{
+ struct procmap_fd *procmap = &self->procmap;
+ unsigned int page_size = self->page_size;
+ char *ptr_a, *ptr_b;
+
+ /*
+ * mremap() such that A and B merge:
+ *
+ * |---------------------------|
+ * | \ |
+ * | |-----------| / |---------|
+ * v | unfaulted | \ | faulted |
+ * |-----------| / |---------|
+ * B \ A
+ *
+ * Then unmap VMA A to trigger the bug.
+ */
+
+ /* Map VMA A into place. */
+ ptr_a = mmap(&self->carveout[page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+ /* Fault it in. */
+ ptr_a[0] = 'x';
+
+ /*
+ * Now move it out of the way so we can place VMA B in position,
+ * unfaulted.
+ */
+ ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /* Map VMA B into place. */
+ ptr_b = mmap(&self->carveout[page_size + 3 * page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /*
+ * Now move VMA A into position with MREMAP_DONTUNMAP to catch incorrect
+ * anon_vma propagation.
+ */
+ ptr_a = mremap(ptr_a, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+ &self->carveout[page_size]);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /* The VMAs should have merged. */
+ ASSERT_TRUE(find_vma_procmap(procmap, ptr_a));
+ ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a);
+ ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 6 * page_size);
+}
+
+TEST_F(merge, mremap_faulted_to_unfaulted_prev_unfaulted_next)
+{
+ struct procmap_fd *procmap = &self->procmap;
+ unsigned int page_size = self->page_size;
+ char *ptr_a, *ptr_b, *ptr_c;
+
+ /*
+ * mremap() with MREMAP_DONTUNMAP such that A, B and C merge:
+ *
+ * |---------------------------|
+ * | \ |
+ * |-----------| | |-----------| / |---------|
+ * | unfaulted | v | unfaulted | \ | faulted |
+ * |-----------| |-----------| / |---------|
+ * A C \ B
+ */
+
+ /* Map VMA B into place. */
+ ptr_b = mmap(&self->carveout[page_size + 3 * page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+ /* Fault it in. */
+ ptr_b[0] = 'x';
+
+ /*
+ * Now move it out of the way so we can place VMAs A, C in position,
+ * unfaulted.
+ */
+ ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /* Map VMA A into place. */
+
+ ptr_a = mmap(&self->carveout[page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /* Map VMA C into place. */
+ ptr_c = mmap(&self->carveout[page_size + 3 * page_size + 3 * page_size],
+ 3 * page_size, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_c, MAP_FAILED);
+
+ /*
+ * Now move VMA B into position with MREMAP_DONTUNMAP to catch incorrect
+ * anon_vma propagation.
+ */
+ ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+ &self->carveout[page_size + 3 * page_size]);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /* The VMAs should have merged. */
+ ASSERT_TRUE(find_vma_procmap(procmap, ptr_a));
+ ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a);
+ ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size);
+}
+
+TEST_F(merge, mremap_faulted_to_unfaulted_prev_faulted_next)
+{
+ struct procmap_fd *procmap = &self->procmap;
+ unsigned int page_size = self->page_size;
+ char *ptr_a, *ptr_b, *ptr_bc;
+
+ /*
+ * mremap() with MREMAP_DONTUNMAP such that A, B and C merge:
+ *
+ * |---------------------------|
+ * | \ |
+ * |-----------| | |-----------| / |---------|
+ * | unfaulted | v | faulted | \ | faulted |
+ * |-----------| |-----------| / |---------|
+ * A C \ B
+ */
+
+ /*
+ * Map VMA B and C into place. We have to map them together so their
+ * anon_vma is the same and the vma->vm_pgoff's are correctly aligned.
+ */
+ ptr_bc = mmap(&self->carveout[page_size + 3 * page_size],
+ 3 * page_size + 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_bc, MAP_FAILED);
+
+ /* Fault it in. */
+ ptr_bc[0] = 'x';
+
+ /*
+ * Now move VMA B out the way (splitting VMA BC) so we can place VMA A
+ * in position, unfaulted, and leave the remainder of the VMA we just
+ * moved in place, faulted, as VMA C.
+ */
+ ptr_b = mremap(ptr_bc, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE, &self->carveout[20 * page_size]);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /* Map VMA A into place. */
+ ptr_a = mmap(&self->carveout[page_size], 3 * page_size,
+ PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
+ ASSERT_NE(ptr_a, MAP_FAILED);
+
+ /*
+ * Now move VMA B into position with MREMAP_DONTUNMAP to catch incorrect
+ * anon_vma propagation.
+ */
+ ptr_b = mremap(ptr_b, 3 * page_size, 3 * page_size,
+ MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+ &self->carveout[page_size + 3 * page_size]);
+ ASSERT_NE(ptr_b, MAP_FAILED);
+
+ /* The VMAs should have merged. */
+ ASSERT_TRUE(find_vma_procmap(procmap, ptr_a));
+ ASSERT_EQ(procmap->query.vma_start, (unsigned long)ptr_a);
+ ASSERT_EQ(procmap->query.vma_end, (unsigned long)ptr_a + 9 * page_size);
+}
+
TEST_HARNESS_MAIN
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
tools-testing-selftests-add-tests-for-tgt-src-mremap-merges.patch
mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too.patch
tools-testing-selftests-add-forked-un-faulted-vma-merge-tests.patch
The patch titled
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
Date: Mon, 5 Jan 2026 20:11:47 +0000
Patch series "mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted
merge", v2.
Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.
However, it is handling merges incorrectly when it comes to mremap() of a
faulted VMA adjacent to an unfaulted VMA. The issues arise in three
cases:
1. Previous VMA unfaulted:
copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev
2. Next VMA unfaulted:
copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next
3. Both adjacent VMAs unfaulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next
This series fixes each of these cases, and introduces selftests to assert
that the issues are corrected.
I also test a further case which was already handled, to assert that my
changes continue to correctly handle it:
4. prev unfaulted, next faulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next
This bug was discovered via a syzbot report, linked to in the first patch
in the series; I confirmed that this series fixes the bug.
I also discovered that we are failing to check that the faulted VMA was
not forked when merging a copied VMA in cases 1-3 above, an issue this
series also addresses.
I also added selftests to assert that this is resolved (and confirmed
that the tests failed prior to this fix).
I also cleaned up vma_expand() as part of this work, renamed
vma_had_uncowed_parents() to vma_is_fork_child() as the previous name was
unduly confusing, and simplified the comments around this function.
This patch (of 4):
Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.
The key piece of logic introduced was the ability to merge a faulted VMA
immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to
correctly handle anon_vma state.
In the case of the merge of an existing VMA (that is changing properties
of a VMA and then merging if those properties are shared by adjacent
VMAs), dup_anon_vma() is invoked correctly.
However in the case of the merge of a new VMA, a corner case peculiar to
mremap() was missed.
The issue is that vma_expand() only performs dup_anon_vma() if the target
(the VMA that will ultimately become the merged VMA) is not the next VMA,
i.e. the one that appears after the range in which the new VMA is to be
established.
A key insight here is that, in all cases other than mremap(), a new VMA
merge either expands an existing VMA, meaning that the target VMA will be
that VMA, or has a NULL anon_vma.
Specifically:
* __mmap_region() - no anon_vma in place, initial mapping.
* do_brk_flags() - expanding an existing VMA.
* vma_merge_extend() - expanding an existing VMA.
* relocate_vma_down() - no anon_vma in place, initial mapping.
In addition, we are in the unique situation of needing to duplicate
anon_vma state from a VMA that is neither the previous nor the next VMA
being merged with.
dup_anon_vma() deals exclusively with the target=unfaulted, src=faulted
case. This leaves four possibilities, in each of which the copied VMA
is faulted:
1. Previous VMA unfaulted:
copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev
target = prev, expand prev to cover.
2. Next VMA unfaulted:
copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next
target = next, expand next to cover.
3. Both adjacent VMAs unfaulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover.
4. prev unfaulted, next faulted:
copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next
target = prev, expand prev to cover. Essentially equivalent to 3, but
with additional requirement that next's anon_vma is the same as the copied
VMA's. This is covered by the existing logic.
To account for this very explicitly, we introduce
vma_merge_copied_range(), which sets a newly introduced vmg->copied_from
field, then invokes vma_merge_new_range() which handles the rest of the
logic.
We then update the key vma_expand() function to clean up the logic and
make what's going on clearer, making the 'remove next' case less special,
before invoking dup_anon_vma() unconditionally should we be copying from a
VMA.
Note that in case 3, the if (remove_next) ... branch will be a no-op, as
next=src in this instance and src is unfaulted.
In case 4, it won't be, but since in this instance next=src and it is
faulted, this will have required tgt=faulted, src=faulted to be
compatible, meaning that next->anon_vma == vmg->copied_from->anon_vma, and
thus a single dup_anon_vma() of next suffices to copy anon_vma state for
the copied-from VMA also.
If we are copying from a VMA in a successful merge we must _always_
propagate anon_vma state.
This issue can be observed most directly by invoking mremap() to move a
VMA around and cause this kind of merge with the MREMAP_DONTUNMAP flag
specified.
This will result in unlink_anon_vmas() being called after we have failed
to duplicate anon_vma state to the target VMA, which results in the
anon_vma itself being freed while folios still possess dangling pointers
to it, and thus a use-after-free bug.
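For illustration, a minimal userspace sketch of this trigger (condensed from
the selftests added later in this series; the carveout layout and offsets
are illustrative, and a KASAN kernel is assumed in order to observe the UAF):

#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long ps = sysconf(_SC_PAGESIZE);
	/* Reserve a carveout so the MAP_FIXED placements below are safe. */
	char *base = mmap(NULL, 32 * ps, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	assert(base != MAP_FAILED);

	/* Faulted VMA A, parked out of the way. */
	char *a = mmap(base + 20 * ps, 3 * ps, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	assert(a != MAP_FAILED);
	a[0] = 'x';	/* fault it in -> A gains an anon_vma */

	/* Unfaulted VMA B immediately preceding the target range. */
	char *b = mmap(base + ps, 3 * ps, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	assert(b != MAP_FAILED);

	/* Move A next to B: the merge must propagate A's anon_vma state. */
	a = mremap(a, 3 * ps, 3 * ps,
		   MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
		   base + 4 * ps);
	assert(a != MAP_FAILED);
	return 0;
}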
This bug was discovered via a syzbot report, which this patch resolves.
We further make a change to update the mergeable anon_vma check to assert
the copied-from anon_vma did not have CoW parents, as otherwise
dup_anon_vma() might incorrectly propagate CoW ancestors from the next VMA
in case 4 despite the anon_vma's being identical for both VMAs.
Link: https://lkml.kernel.org/r/cover.1767638272.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/b7930ad2b1503a657e29fe928eb33061d7eadf5b.17676382…
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Reported-by: syzbot+b165fc2e11771c66d8ba(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/
Cc: David Hildenbrand (Red Hat) <david(a)kernel.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jeongjun Park <aha310510(a)gmail.com>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Cc: Pedro Falcato <pfalcato(a)suse.de>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vma.c | 84 +++++++++++++++++++++++++++++++++++++----------------
mm/vma.h | 3 +
2 files changed, 62 insertions(+), 25 deletions(-)
--- a/mm/vma.c~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.c
@@ -829,6 +829,8 @@ static __must_check struct vm_area_struc
VM_WARN_ON_VMG(middle &&
!(vma_iter_addr(vmg->vmi) >= middle->vm_start &&
vma_iter_addr(vmg->vmi) < middle->vm_end), vmg);
+ /* An existing merge can never be used by the mremap() logic. */
+ VM_WARN_ON_VMG(vmg->copied_from, vmg);
vmg->state = VMA_MERGE_NOMERGE;
@@ -1099,6 +1101,33 @@ struct vm_area_struct *vma_merge_new_ran
}
/*
+ * vma_merge_copied_range - Attempt to merge a VMA that is being copied by
+ * mremap()
+ *
+ * @vmg: Describes the VMA we are adding, in the copied-to range @vmg->start to
+ * @vmg->end (exclusive), which we try to merge with any adjacent VMAs if
+ * possible.
+ *
+ * vmg->prev, next, start, end, pgoff should all be relative to the COPIED TO
+ * range, i.e. the target range for the VMA.
+ *
+ * Returns: In instances where no merge was possible, NULL. Otherwise, a pointer
+ * to the VMA we expanded.
+ *
+ * ASSUMPTIONS: Same as vma_merge_new_range(), except vmg->middle must contain
+ * the copied-from VMA.
+ */
+static struct vm_area_struct *vma_merge_copied_range(struct vma_merge_struct *vmg)
+{
+ /* We must have a copied-from VMA. */
+ VM_WARN_ON_VMG(!vmg->middle, vmg);
+
+ vmg->copied_from = vmg->middle;
+ vmg->middle = NULL;
+ return vma_merge_new_range(vmg);
+}
+
+/*
* vma_expand - Expand an existing VMA
*
* @vmg: Describes a VMA expansion operation.
@@ -1117,46 +1146,52 @@ struct vm_area_struct *vma_merge_new_ran
int vma_expand(struct vma_merge_struct *vmg)
{
struct vm_area_struct *anon_dup = NULL;
- bool remove_next = false;
struct vm_area_struct *target = vmg->target;
struct vm_area_struct *next = vmg->next;
+ bool remove_next = false;
vm_flags_t sticky_flags;
-
- sticky_flags = vmg->vm_flags & VM_STICKY;
- sticky_flags |= target->vm_flags & VM_STICKY;
-
- VM_WARN_ON_VMG(!target, vmg);
+ int ret = 0;
mmap_assert_write_locked(vmg->mm);
-
vma_start_write(target);
- if (next && (target != next) && (vmg->end == next->vm_end)) {
- int ret;
- sticky_flags |= next->vm_flags & VM_STICKY;
+ if (next && target != next && vmg->end == next->vm_end)
remove_next = true;
- /* This should already have been checked by this point. */
- VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
- vma_start_write(next);
- /*
- * In this case we don't report OOM, so vmg->give_up_on_mm is
- * safe.
- */
- ret = dup_anon_vma(target, next, &anon_dup);
- if (ret)
- return ret;
- }
+ /* We must have a target. */
+ VM_WARN_ON_VMG(!target, vmg);
+ /* This should have already been checked by this point. */
+ VM_WARN_ON_VMG(remove_next && !can_merge_remove_vma(next), vmg);
/* Not merging but overwriting any part of next is not handled. */
VM_WARN_ON_VMG(next && !remove_next &&
next != target && vmg->end > next->vm_start, vmg);
- /* Only handles expanding */
+ /* Only handles expanding. */
VM_WARN_ON_VMG(target->vm_start < vmg->start ||
target->vm_end > vmg->end, vmg);
+ sticky_flags = vmg->vm_flags & VM_STICKY;
+ sticky_flags |= target->vm_flags & VM_STICKY;
if (remove_next)
- vmg->__remove_next = true;
+ sticky_flags |= next->vm_flags & VM_STICKY;
+
+ /*
+ * If we are removing the next VMA or copying from a VMA
+ * (e.g. mremap()'ing), we must propagate anon_vma state.
+ *
+ * Note that, by convention, callers ignore OOM for this case, so
+ * we don't need to account for vmg->give_up_on_mm here.
+ */
+ if (remove_next)
+ ret = dup_anon_vma(target, next, &anon_dup);
+ if (!ret && vmg->copied_from)
+ ret = dup_anon_vma(target, vmg->copied_from, &anon_dup);
+ if (ret)
+ return ret;
+ if (remove_next) {
+ vma_start_write(next);
+ vmg->__remove_next = true;
+ }
if (commit_merge(vmg))
goto nomem;
@@ -1828,10 +1863,9 @@ struct vm_area_struct *copy_vma(struct v
if (new_vma && new_vma->vm_start < addr + len)
return NULL; /* should never get here */
- vmg.middle = NULL; /* New VMA range. */
vmg.pgoff = pgoff;
vmg.next = vma_iter_next_rewind(&vmi, NULL);
- new_vma = vma_merge_new_range(&vmg);
+ new_vma = vma_merge_copied_range(&vmg);
if (new_vma) {
/*
--- a/mm/vma.h~mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge
+++ a/mm/vma.h
@@ -106,6 +106,9 @@ struct vma_merge_struct {
struct anon_vma_name *anon_name;
enum vma_merge_state state;
+ /* If copied from (i.e. mremap()'d) the VMA from which we are copying. */
+ struct vm_area_struct *copied_from;
+
/* Flags which callers can use to modify merge behaviour: */
/*
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-vma-fix-anon_vma-uaf-on-mremap-faulted-unfaulted-merge.patch
tools-testing-selftests-add-tests-for-tgt-src-mremap-merges.patch
mm-vma-enforce-vma-fork-limit-on-unfaultedfaulted-mremap-merge-too.patch
tools-testing-selftests-add-forked-un-faulted-vma-merge-tests.patch