+cc Paul for hare-brained idea
On Tue, Nov 11, 2025 at 04:56:05PM -0500, Liam R. Howlett wrote:
The retry in lock_vma_under_rcu() drops the rcu read lock before reacquiring the lock and trying again. This may cause a use-after-free if the maple node the maple state was using was freed.
The maple state is protected by the rcu read lock. When the lock is dropped, the state cannot be reused as it tracks pointers to objects that may be freed during the time where the lock was not held.
Any time the rcu read lock is dropped, the maple state must be invalidated. Resetting the address and state to MA_START is the safest course of action, which will result in the next operation starting from the top of the tree.
Since we all missed it I do wonder if we need some super clear comment saying 'hey if you drop + re-acquire RCU lock you MUST revalidate mas state by doing 'blah'.
I think one source of confusion for me with maple tree operations is - what to do if we are in a position where some kind of reset is needed?
So even if I'd realised 'aha we need to reset this' it wouldn't be obvious to me that we ought to set to the address.
I guess a mas_reset() would keep mas->index, last where they where which also wouldn't be right would it?
In which case a mas_reset() is _also_ not a valid operation if invoked after dropping/reacquiring the RCU lock right?
Prior to commit 0b16f8bed19c ("mm: change vma_start_read() to drop RCU lock on failure"), the rcu read lock was dropped and NULL was returned, so the retry would not have happened. However, now that the read lock is dropped regardless of the return, we may use a freed maple tree node cached in the maple state on retry.
Cc: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org Fixes: 0b16f8bed19c ("mm: change vma_start_read() to drop RCU lock on failure") Reported-by: syzbot+131f9eb2b5807573275c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=131f9eb2b5807573275c Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com
The reasoning seems sensible & LGTM, so:
Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
mm/mmap_lock.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 39f341caf32c0..f2532af6208c0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -257,6 +257,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, if (PTR_ERR(vma) == -EAGAIN) { count_vm_vma_lock_event(VMA_LOCK_MISS); /* The area was replaced with another one */
mas_set(&mas, address);
I wonder if we could detect that the RCU lock was released (+ reacquired) in mas_walk() in a debug mode, like CONFIG_VM_DEBUG_MAPLE_TREE?
Not sure if that's feasible, maybe Paul can comment? :)
I think Vlastimil made a similar kind of comment possibly off-list.
Would there be much overhead if we just did this:
retry: rcu_read_lock(); mas_set(&mas, address); vma = mas_walk(&mas);
The retry path will be super rare, and I think the compiler should be smart enough to not assign index, last twice and this would protect us.
Then we could have some function like:
mas_walk_from(&mas, address);
That did this.
Or, since we _literally_ only use mas for this one walk, have a simple function like:
/** * ... * Performs a single walk of a maple tree to the specified address, * returning a pointer to an entry if found, or NULL if not. * ... */ static void *mt_walk(struct maple_tree *mt, unsigned long address) { MA_STATE(mas, mt, address, adderss);
lockdep_assert_in_rcu_read_lock(); return mas_walk(&mas); }
That literally just does the walk as a one-off?
goto retry; }-- 2.47.2
Cheers, Lorenzo