On 01/17, Bernd Edlinger wrote:
The problem happens when a tracer tries to ptrace_attach to a multi-threaded process, that does an execve in one of the threads at the same time, without doing that in a forked sub-process. That means: There is a race condition, when one or more of the threads are already ptraced, but the thread that invoked the execve is not yet traced. Now in this case the execve locks the cred_guard_mutex and waits for de_thread to complete. But that waits for the traced sibling threads to exit, and those have to wait for the tracer to receive the exit signal, but the tracer cannot call wait right now, because it is waiting for the ptrace call to complete, and this never does not happen. The traced process and the tracer are now in a deadlock situation, and can only be killed by a fatal signal.
This looks very confusing to me. And even misleading.
So IIRC the problem is "simple".
de_thread() sleeps with cred_guard_mutex waiting for other threads to exit and pass release_task/__exit_signal.
If one of the sub-threads is traced, debugger should do ptrace_detach() or wait() to release this tracee, the killed tracee won't autoreap.
Yes. but the tracer has to do its job, and that is ptrace_attach the remaining treads, it does not know that it would avoid a dead-lock when it calls wait(), instead of ptrace_attach. It does not know that the tracee has just called execve in one of the not yet traced threads.
Hmm. I don't understand you.
I agree we have a problem which should be fixed. Just the changelog looks confusing to me, imo it doesn't explain the race/problem clearly.
Now. If debugger tries to take the same cred_guard_mutex before detach/wait we have a deadlock. This is not specific to ptrace_attach(), proc_pid_attr_write() takes this lock too.
Right? Or are there other issues?
No, proc_pid_attr_write has no problem if it waits for cred_guard_mutex, because it is only called from one of the sibling threads,
OK, thanks, I was wrong. I forgot about "A task may only write its own attributes". So yes, ptrace_attach() is the only source of problematic mutex_lock() today. There were more in the past.
if (unlikely(t->ptrace)
&& (t != tsk->group_leader || !t->exit_state))
unsafe_execve_in_progress = true;
The !t->exit_state is not right... This sub-thread can already be a zombie with ->exit_state != 0 but see above, it won't be reaped until the debugger does wait().
I dont think so. de_thread() handles the group_leader different than normal threads.
I don't follow...
I didn't say that t is a group leader. I said it can be a zombie sub-thread with ->exit_state != 0.
That means normal threads have to wait for being released from the zombie state by the tracer: sig->notify_count > 0, and de_thread is woken up by __exit_signal
That is what I said before. Debugger should release a zombie sub-thread, it won't do __exit_signal() on its own.
- if (unlikely(unsafe_execve_in_progress)) {
spin_unlock_irq(lock);
sig->exec_bprm = bprm;
mutex_unlock(&sig->cred_guard_mutex);
spin_lock_irq(lock);
I don't understand why do we need to unlock and lock siglock here...
That is just a precaution because I did want to release the mutexes exactly in the reverse order as they were acquired.
To me this adds the unnecessary complication.
But my main question is why do we need the unsafe_execve_in_progress boolean. If this patch is correct and de_thread() can drop and re-acquire cread_guard_mutex when one of the threads is traced, then why can't we do this unconditionally ?
I just wanted to keep the impact of the change as small as possible,
But the unsafe_execve_in_progress logic increases the impact and complicates the patch.
I think the fix should be as simple as possible. (to be honest, right now I don't think this is a right approach).
including possible performance degradation due to double checking of credentials.
Not sure I understand, but you can add the performance improvements later. Not to mention that this should be justified, and the for_other_threads() loop added by this patch into de_thread() is not nice performance-wise.
Oleg.