On 05/16, Christian Brauner wrote:
> With the introduction of pidfds through CLONE_PIDFD it is possible to
> create pidfds at process creation time.
Now I am wondering why we need CLONE_PIDFD, you can just do

	pid = fork();
	pidfd_open(pid);

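For illustration, the fork() + pidfd_open() sequence could look roughly like this from userspace. This is only a sketch under some assumptions: a kernel that provides pidfd_open() (5.3 or later), a fallback syscall number of 434 where the libc headers lack SYS_pidfd_open, and the hypothetical helper names sys_pidfd_open(), fork_and_open_pidfd() and close_and_reap(), none of which come from the patch. It also assumes nobody else reaps the child and SIGCHLD is not ignored, so the pid cannot be recycled before pidfd_open() runs.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
/* Assumption: generic syscall number; some architectures (e.g. alpha) differ. */
#define SYS_pidfd_open 434
#endif

/* Thin wrapper, since older libc versions ship no pidfd_open() function. */
static int sys_pidfd_open(pid_t pid, unsigned int flags)
{
	return (int)syscall(SYS_pidfd_open, pid, flags);
}

/*
 * Hypothetical helper: fork a child that exits at once, then open a pidfd
 * for it in the parent. Returns the pidfd, or -1 with errno set.
 * Until the parent waits for it, the child stays a zombie and keeps its pid.
 */
static int fork_and_open_pidfd(pid_t *child)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);

	*child = pid;
	return sys_pidfd_open(pid, 0);
}

/*
 * Hypothetical cleanup helper: close the pidfd (if any) and reap the child.
 * Returns the wait status, or -1 if waitpid() fails.
 */
static int close_and_reap(int pidfd, pid_t child)
{
	int status;

	if (pidfd >= 0)
		close(pidfd);
	while (waitpid(child, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return status;
}
```

A caller would use fork_and_open_pidfd(&child), check the returned fd, and later call close_and_reap(fd, child).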
> +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> +{
> +	int fd, ret;
> +	struct pid *p;
> +	struct task_struct *tsk;
> +
> +	if (flags)
> +		return -EINVAL;
> +
> +	if (pid <= 0)
> +		return -EINVAL;
> +
> +	p = find_get_pid(pid);
> +	if (!p)
> +		return -ESRCH;
> +
> +	ret = 0;
> +	rcu_read_lock();
> +	/*
> +	 * If this returns non-NULL the pid was used as a thread-group
> +	 * leader. Note, we race with exec here: If it changes the
> +	 * thread-group leader we might return the old leader.
> +	 */
> +	tsk = pid_task(p, PIDTYPE_TGID);
> +	if (!tsk)
> +		ret = -ESRCH;
> +	rcu_read_unlock();
> +
> +	fd = ret ?: pidfd_create(p);
> +	put_pid(p);
> +	return fd;
> +}
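
The argument checks in the quoted hunk can be exercised from userspace. Below is a minimal sketch, assuming a kernel that provides pidfd_open() and a fallback syscall number of 434 where the headers lack SYS_pidfd_open. The helper name pidfd_open_errno() is hypothetical and not part of the patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
/* Assumption: generic syscall number; some architectures (e.g. alpha) differ. */
#define SYS_pidfd_open 434
#endif

/*
 * Hypothetical helper: call pidfd_open() and return 0 on success or the
 * errno value on failure. A successfully opened pidfd is closed again
 * right away, because only the error code is of interest here.
 */
static int pidfd_open_errno(pid_t pid, unsigned int flags)
{
	long fd = syscall(SYS_pidfd_open, pid, flags);

	if (fd < 0)
		return errno;
	close((int)fd);
	return 0;
}
```

Going by the checks above, pidfd_open_errno(getpid(), ~0u) and pidfd_open_errno(-1, 0) should both give EINVAL, while pidfd_open_errno(getpid(), 0) should give 0 when called from a single-threaded process.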

Looks correct, feel free to add

Reviewed-by: Oleg Nesterov <oleg@redhat.com>

But why do we need task_struct *tsk?

	rcu_read_lock();
	if (!pid_task(p, PIDTYPE_TGID))
		ret = -ESRCH;
	rcu_read_unlock();

and in fact we do not even need rcu_read_lock(), we could do

	// shut up rcu_dereference_check()
	rcu_lock_acquire(&rcu_lock_map);
	if (!pid_task(p, PIDTYPE_TGID))
		ret = -ESRCH;
	rcu_lock_release(&rcu_lock_map);

Well... I won't insist, but the comment about the race with exec looks a bit confusing to me. It is true, but we do not care at all, we are not going to use the task_struct returned by pid_task().
Oleg.