By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org --- Changes since v1 [1] - Made a copy of vma->mm before using it in vma_start_read(), per Vlastimil Babka
Notes: - Applies cleanly over mm-unstable. - Should be applied to 6.15 and 6.16 but these branches do not have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_); + + /* + * If vma got attached to another mm from under us, that mm is not + * stable and can be freed in the narrow window after vma->vm_refcnt + * is dropped and before rcuwait_wake_up(mm) is called. Grab it before + * releasing vma->vm_refcnt. + */ + if (unlikely(vma->vm_mm != mm)) { + /* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */ + struct mm_struct *other_mm = vma->vm_mm; + /* + * __mmdrop() is a heavy operation and we don't need RCU + * protection here. Release RCU lock during these operations. + */ + rcu_read_unlock(); + mmgrab(other_mm); + vma_refcount_put(vma); + mmdrop(other_mm); + rcu_read_lock(); + return NULL; + } + /* * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */ - if (unlikely(vma->vm_mm != mm || - address < vma->vm_start || address >= vma->vm_end)) + if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read;
rcu_read_unlock(); @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
- /* - * Verify the vma we locked belongs to the same address space and it's - * not behind of the last search position. - */ - if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end)) + /* Verify the vma is not behind of the last search position. */ + if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock;
/*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On Mon, 28 Jul 2025 10:53:55 -0700 Suren Baghdasaryan surenb@google.com wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Thanks, I'll add this to mm-unstable with a plan to include it in the second batch of MM-updates->Linus next week.
The patch won't apply to 6.15 so I expect the -stable maintainers will be asking you for a backportable version.
On 7/28/25 19:53, Suren Baghdasaryan wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org
Reviewed-by: Vlastimil Babka vbabka@suse.cz
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h> #define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, } rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
- /*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
- if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
- }
- /*
- Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
- False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */ /* Check if the vma we locked is the right one. */
- if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
- if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read;
rcu_read_unlock(); @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
- /*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
- if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
- /* Verify the vma is not behind of the last search position. */
- if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock;
/*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On 7/28/25 19:53, Suren Baghdasaryan wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org
As for further steps, considered replying to [1] but maybe it's better here.
As for a KISS fix including stable, great. Seems a nice improvement to actually handle "vma->vm_mm != mm" in vma_start_read() like this - good idea!
Less great is that there's now a subtle assumption that some (but not all!) cases where vma_start_read() returns NULL imply that we dropped the rcu lock. And it doesn't matter as the callers abort or fallback to mmap sem anyway in that case. Hopefully we can improve that a bit.
The idea of moving rcu lock and mas walk inside vma_start_read() is indeed busted with lock_next_vma(). The iterator difference could be perhaps solved by having lock_vma_under_rcu() set up its own one (instead of MA_STATE) in a way that vma_next() would do the right thing for it. However there would still be the difference that lock_next_vma() expects we are already under rcu lock, and lock_vma_under_rcu() doesn't.
So what we can perhaps do instead is move vma_start_read() to mm/mmap_lock.c (no other users so why expose it in a header for potential misuse). And then indeed just make it drop rcu lock completely (and not reacquire) any time it's returning NULL, document that and adjust callers to that. I think it's less subtle than dropping and reacquring, and should simplify the error handling in the callers a bit.
[1] https://lore.kernel.org/all/CAJuCfpEMhGe_eZuFm__4CDstM9%3DOG2kTUTziNL-f%3DM3...
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h> #define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, } rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
- /*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
- if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
- }
- /*
- Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
- False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */ /* Check if the vma we locked is the right one. */
- if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
- if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read;
rcu_read_unlock(); @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
- /*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
- if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
- /* Verify the vma is not behind of the last search position. */
- if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock;
/*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On Mon, Jul 28, 2025 at 10:04 PM Vlastimil Babka vbabka@suse.cz wrote:
On 7/28/25 19:53, Suren Baghdasaryan wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org
As for further steps, considered replying to [1] but maybe it's better here.
As for a KISS fix including stable, great. Seems a nice improvement to actually handle "vma->vm_mm != mm" in vma_start_read() like this - good idea!
Less great is that there's now a subtle assumption that some (but not all!) cases where vma_start_read() returns NULL imply that we dropped the rcu lock. And it doesn't matter as the callers abort or fallback to mmap sem anyway in that case. Hopefully we can improve that a bit.
The idea of moving rcu lock and mas walk inside vma_start_read() is indeed busted with lock_next_vma(). The iterator difference could be perhaps solved by having lock_vma_under_rcu() set up its own one (instead of MA_STATE) in a way that vma_next() would do the right thing for it. However there would still be the difference that lock_next_vma() expects we are already under rcu lock, and lock_vma_under_rcu() doesn't.
So what we can perhaps do instead is move vma_start_read() to mm/mmap_lock.c (no other users so why expose it in a header for potential misuse). And then indeed just make it drop rcu lock completely (and not reacquire) any time it's returning NULL, document that and adjust callers to that. I think it's less subtle than dropping and reacquring, and should simplify the error handling in the callers a bit.
Thanks for the suggestion. That was actually exactly one of the options I was considering but I thought this dropping RCU schema would still be uglier than dropping and reacquiring the RCU lock. If you think otherwise I can make the change as you suggested for mm-unstable and keep this original change for stable only. Should I do that?
[1] https://lore.kernel.org/all/CAJuCfpEMhGe_eZuFm__4CDstM9%3DOG2kTUTziNL-f%3DM3...
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
/*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
}
/* * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */
if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read; rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
/*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
/* Verify the vma is not behind of the last search position. */
if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock; /*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On 7/29/25 00:53, Suren Baghdasaryan wrote:
On Mon, Jul 28, 2025 at 10:04 PM Vlastimil Babka vbabka@suse.cz wrote:
As for further steps, considered replying to [1] but maybe it's better here.
As for a KISS fix including stable, great. Seems a nice improvement to actually handle "vma->vm_mm != mm" in vma_start_read() like this - good idea!
Less great is that there's now a subtle assumption that some (but not all!) cases where vma_start_read() returns NULL imply that we dropped the rcu lock. And it doesn't matter as the callers abort or fallback to mmap sem anyway in that case. Hopefully we can improve that a bit.
The idea of moving rcu lock and mas walk inside vma_start_read() is indeed busted with lock_next_vma(). The iterator difference could be perhaps solved by having lock_vma_under_rcu() set up its own one (instead of MA_STATE) in a way that vma_next() would do the right thing for it. However there would still be the difference that lock_next_vma() expects we are already under rcu lock, and lock_vma_under_rcu() doesn't.
So what we can perhaps do instead is move vma_start_read() to mm/mmap_lock.c (no other users so why expose it in a header for potential misuse). And then indeed just make it drop rcu lock completely (and not reacquire) any time it's returning NULL, document that and adjust callers to that. I think it's less subtle than dropping and reacquring, and should simplify the error handling in the callers a bit.
Thanks for the suggestion. That was actually exactly one of the options I was considering but I thought this dropping RCU schema would still be uglier than dropping and reacquiring the RCU lock. If you think otherwise I can make the change as you suggested for mm-unstable and keep this original change for stable only. Should I do that?
If we decide anything, I would do it as a cleanup on top of the fix that will now go to mainline and then stable. We don't want to deviate for stable unnecessarily (removing an extraneous hunk in stable backport is fine).
As for which case is uglier, I don't know really. Dropping and reacquiring rcy lock in very rare cases, leading to even rarer situations where it would cause an issue, seems more dangerous to me than just dropping everytime we return NULL for any of the reasons, which is hopefully less rare and an error such as caller trying to drop rcu lock again will blow up immediately. Maybe others have opinions...
[1] https://lore.kernel.org/all/CAJuCfpEMhGe_eZuFm__4CDstM9%3DOG2kTUTziNL-f%3DM3...
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
/*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
}
/* * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */
if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read; rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
/*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
/* Verify the vma is not behind of the last search position. */
if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock; /*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On Tue, Jul 29, 2025 at 5:03 AM Vlastimil Babka vbabka@suse.cz wrote:
On 7/29/25 00:53, Suren Baghdasaryan wrote:
On Mon, Jul 28, 2025 at 10:04 PM Vlastimil Babka vbabka@suse.cz wrote:
As for further steps, considered replying to [1] but maybe it's better here.
As for a KISS fix including stable, great. Seems a nice improvement to actually handle "vma->vm_mm != mm" in vma_start_read() like this - good idea!
Less great is that there's now a subtle assumption that some (but not all!) cases where vma_start_read() returns NULL imply that we dropped the rcu lock. And it doesn't matter as the callers abort or fallback to mmap sem anyway in that case. Hopefully we can improve that a bit.
The idea of moving rcu lock and mas walk inside vma_start_read() is indeed busted with lock_next_vma(). The iterator difference could be perhaps solved by having lock_vma_under_rcu() set up its own one (instead of MA_STATE) in a way that vma_next() would do the right thing for it. However there would still be the difference that lock_next_vma() expects we are already under rcu lock, and lock_vma_under_rcu() doesn't.
So what we can perhaps do instead is move vma_start_read() to mm/mmap_lock.c (no other users so why expose it in a header for potential misuse). And then indeed just make it drop rcu lock completely (and not reacquire) any time it's returning NULL, document that and adjust callers to that. I think it's less subtle than dropping and reacquring, and should simplify the error handling in the callers a bit.
Thanks for the suggestion. That was actually exactly one of the options I was considering but I thought this dropping RCU schema would still be uglier than dropping and reacquiring the RCU lock. If you think otherwise I can make the change as you suggested for mm-unstable and keep this original change for stable only. Should I do that?
If we decide anything, I would do it as a cleanup on top of the fix that will now go to mainline and then stable. We don't want to deviate for stable unnecessarily (removing an extraneous hunk in stable backport is fine).
Good point. I'll do the refactoring on top of this fix and that one does not need to go into stable branch.
As for which case is uglier, I don't know really. Dropping and reacquiring rcy lock in very rare cases, leading to even rarer situations where it would cause an issue, seems more dangerous to me than just dropping everytime we return NULL for any of the reasons, which is hopefully less rare and an error such as caller trying to drop rcu lock again will blow up immediately. Maybe others have opinions...
Yeah, you are probably right. It would be more explicit that the VMA can't be used after a failed vma_start_read() if vma_start_read() were to always drop RCU on exit. Either way I think I should add a big fat warning to vma_start_read() comment that the VMA passed to vma_start_read() can't be used if this function fails to lock it. I'll post a v3 with that comment added and Lorenzo's nits addressed, then I'll prepare a refactoring patchset for mm-unstable to drop RCU upon vma_start_read() exit.
[1] https://lore.kernel.org/all/CAJuCfpEMhGe_eZuFm__4CDstM9%3DOG2kTUTziNL-f%3DM3...
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
/*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
}
/* * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */
if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read; rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
/*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
/* Verify the vma is not behind of the last search position. */
if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock; /*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
On Mon, Jul 28, 2025 at 10:53:55AM -0700, Suren Baghdasaryan wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org
This LGTM AFAICT, so:
Acked-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
I'll fold a description of this check into the detailed impl notes I'm writing.
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
- /*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
- if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
NIT: Not sure if we should have a space before the comment below. But it doesn't matter... :)
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
NIT: Maybe worth saying 'we reinstate the RCU read lock as the caller expects it to be held when this functino returns even on error' or something like this.
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
- }
- /*
- Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
- False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */
- if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read;
rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
- /*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
- if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
- /* Verify the vma is not behind of the last search position. */
NIT: 'behind of' should be 'behind', 'behind of' is weird sounding here
if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock;
/*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
2.50.1.487.gc89ff58d15-goog
On Tue, Jul 29, 2025 at 2:57 AM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
On Mon, Jul 28, 2025 at 10:53:55AM -0700, Suren Baghdasaryan wrote:
By inducing delays in the right places, Jann Horn created a reproducer for a hard to hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
The diagram depicting the race: T1 T2 T3 == == == lock_vma_under_rcu mas_walk <VMA gets removed from mm> mmap <the same VMA is reallocated> vma_start_read __refcount_inc_not_zero_limited_acquire munmap __vma_enter_locked refcount_add_not_zero vma_end_read vma_refcount_put __refcount_dec_and_test rcuwait_wait_event <finish operation> rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1. To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU") Reported-by: Jann Horn jannh@google.com Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org
This LGTM AFAICT, so:
Acked-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Thanks! I think I'll respin v3 to add a warning comment to vma_start_read(), so will address your nits there.
I'll fold a description of this check into the detailed impl notes I'm writing.
Changes since v1 [1]
- Made a copy of vma->mm before using it in vma_start_read(),
per Vlastimil Babka
Notes:
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16 but these branches do not
have lock_next_vma() function, so the change in lock_next_vma() should be skipped when applying to those branches.
[1] https://lore.kernel.org/all/20250728170950.2216966-1-surenb@google.com/
include/linux/mmap_lock.h | 23 +++++++++++++++++++++++ mm/mmap_lock.c | 10 +++------- 2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 1f4f44951abe..da34afa2f8ef 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w); #include <linux/tracepoint-defs.h> #include <linux/types.h> #include <linux/cleanup.h> +#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \ .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock), @@ -183,6 +184,28 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm, }
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
/*
* If vma got attached to another mm from under us, that mm is not
* stable and can be freed in the narrow window after vma->vm_refcnt
* is dropped and before rcuwait_wake_up(mm) is called. Grab it before
* releasing vma->vm_refcnt.
*/
if (unlikely(vma->vm_mm != mm)) {
/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
struct mm_struct *other_mm = vma->vm_mm;
NIT: Not sure if we should have a space before the comment below. But it doesn't matter... :)
/*
* __mmdrop() is a heavy operation and we don't need RCU
* protection here. Release RCU lock during these operations.
*/
NIT: Maybe worth saying 'we reinstate the RCU read lock as the caller expects it to be held when this functino returns even on error' or something like this.
rcu_read_unlock();
mmgrab(other_mm);
vma_refcount_put(vma);
mmdrop(other_mm);
rcu_read_lock();
return NULL;
}
/* * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 729fb7d0dd59..aa3bc42ecde0 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, */
/* Check if the vma we locked is the right one. */
if (unlikely(vma->vm_mm != mm ||
address < vma->vm_start || address >= vma->vm_end))
if (unlikely(address < vma->vm_start || address >= vma->vm_end)) goto inval_end_read; rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm, goto fallback; }
/*
* Verify the vma we locked belongs to the same address space and it's
* not behind of the last search position.
*/
if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
/* Verify the vma is not behind of the last search position. */
NIT: 'behind of' should be 'behind', 'behind of' is weird sounding here
if (unlikely(from_addr >= vma->vm_end)) goto fallback_unlock; /*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
2.50.1.487.gc89ff58d15-goog
linux-stable-mirror@lists.linaro.org