 
            On Fri, 11 Jun 2021 17:55:09 +0200 Bernd Edlinger bernd.edlinger@hotmail.de wrote:
This introduces signal->unsafe_execve_in_progress, which is used to fix the case when at least one of the sibling threads is traced, and therefore the trace process may dead-lock in ptrace_attach, but de_thread will need to wait for the tracer to continue execution.
The solution is to detect this situation and allow ptrace_attach to continue, while de_thread() is still waiting for traced zombies to be eventually released. When the current thread changed the ptrace status from non-traced to traced, we can simply abort the whole execve and restart it by returning -ERESTARTSYS. This needs to be done before changing the thread leader, because the PTRACE_EVENT_EXEC needs to know the old thread pid.
Although it is technically after the point of no return, we just have to reset bprm->point_of_no_return here, since at this time only the other threads have received a fatal signal, not the current thread.
From the user's point of view the whole execve was
simply delayed until after the ptrace_attach.
Other threads die quickly since the cred_guard_mutex is released, but a deadly signal is already pending. In case the mutex_lock_killable misses the signal, ->unsafe_execve_in_progress makes sure they release the mutex immediately and return with -ERESTARTNOINTR.
This means there is no API change, unlike the previous version of this patch which was discussed here:
https://lore.kernel.org/lkml/b6537ae6-31b1-5c50-f32b-8b8332ace882@hotmail.de...
See tools/testing/selftests/ptrace/vmaccess.c for a test case that gets fixed by this change.
Note that since the test case was originally designed to test the ptrace_attach returning an error in this situation, the test expectation needed to be adjusted, to allow the API to succeed at the first attempt.
err, sorry. I replied to the v8 patch, not to v9.
--- a/fs/exec.c~exec-fix-dead-lock-in-de_thread-with-ptrace_attach-v9 +++ a/fs/exec.c @@ -1056,29 +1056,31 @@ static int de_thread(struct task_struct return -EAGAIN; }
- while_each_thread(tsk, t) { - if (unlikely(t->ptrace) && t != tsk->group_leader) - sig->unsafe_execve_in_progress = true; - } - sig->group_exit_task = tsk; sig->notify_count = zap_other_threads(tsk); if (!thread_group_leader(tsk)) sig->notify_count--; - spin_unlock_irq(lock);
- if (unlikely(sig->unsafe_execve_in_progress)) + while_each_thread(tsk, t) { + if (unlikely(t->ptrace) && t != tsk->group_leader) + sig->unsafe_execve_in_progress = true; + } + + if (unlikely(sig->unsafe_execve_in_progress)) { + spin_unlock_irq(lock); mutex_unlock(&sig->cred_guard_mutex); + spin_lock_irq(lock); + }
- for (;;) { - set_current_state(TASK_KILLABLE); - if (!sig->notify_count) - break; + while (sig->notify_count) { + __set_current_state(TASK_KILLABLE); + spin_unlock_irq(lock); schedule(); if (__fatal_signal_pending(tsk)) goto killed; + spin_lock_irq(lock); } - __set_current_state(TASK_RUNNING); + spin_unlock_irq(lock);
if (unlikely(sig->unsafe_execve_in_progress)) { if (mutex_lock_killable(&sig->cred_guard_mutex)) _