6.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov oleg@redhat.com
commit a15f37a40145c986cdf289a4b88390f35efdecc4 upstream.
The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit() path is very broken.
sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct itself. If tsk != current and tsk is not a leader, this process can exit/exec and task_lock(tsk->group_leader) may use the already freed task_struct.
Another problem is that sys_prlimit64() can race with mt-exec which changes ->group_leader. In this case do_prlimit() may take the wrong lock, or (worse) ->group_leader may change between task_lock() and task_unlock().
Change sys_prlimit64() to take tasklist_lock when necessary. This is not nice, but I don't see a better fix for -stable.
Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com Fixes: 18c91bb2d872 ("prlimit: do not grab the tasklist_lock") Signed-off-by: Oleg Nesterov oleg@redhat.com Cc: Christian Brauner brauner@kernel.org Cc: Jiri Slaby jirislaby@kernel.org Cc: Mateusz Guzik mjguzik@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -1734,6 +1734,7 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u struct rlimit old, new; struct task_struct *tsk; unsigned int checkflags = 0; + bool need_tasklist; int ret;
if (old_rlim) @@ -1760,8 +1761,25 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u get_task_struct(tsk); rcu_read_unlock();
- ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL, - old_rlim ? &old : NULL); + need_tasklist = !same_thread_group(tsk, current); + if (need_tasklist) { + /* + * Ensure we can't race with group exit or de_thread(), + * so tsk->group_leader can't be freed or changed until + * read_unlock(tasklist_lock) below. + */ + read_lock(&tasklist_lock); + if (!pid_alive(tsk)) + ret = -ESRCH; + } + + if (!ret) { + ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL, + old_rlim ? &old : NULL); + } + + if (need_tasklist) + read_unlock(&tasklist_lock);
if (!ret && old_rlim) { rlim_to_rlim64(&old, &old64);