css_task_iter_next() pins and returns a task, but the task can do anything between that point and the call to cgroup_procs_show(), including dying and losing its PID. When that happens, task_pid_vnr() returns 0.
d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out") makes this more likely as tasks now stay iterable with css_task_iter_next() until the last schedule is complete, which can be after the task has lost its PID.
Showing "0" in cgroup.procs or cgroup.threads is confusing and can lead to surprising outcomes. For example, if a user tries to kill PID 0, the signal is delivered to every process in the caller's process group.
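To illustrate the hazard from the userspace side, here is a minimal sketch of a defensive reader. It is not part of this patch; parse_cgroup_pid() is a hypothetical helper name, and the sketch assumes one decimal PID per line as cgroup.procs and cgroup.threads provide.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): parse one line read from
 * cgroup.procs or cgroup.threads. Returns the PID, or -1 if the line
 * does not hold a strictly positive decimal PID.
 */
pid_t parse_cgroup_pid(const char *line)
{
	char *end;
	long val;

	if (!line)
		return -1;

	errno = 0;
	val = strtol(line, &end, 10);
	if (errno || end == line)
		return -1;
	/* Accept only a bare number, optionally followed by a newline. */
	if (*end != '\n' && *end != '\0')
		return -1;
	/*
	 * kill(0, sig) targets the caller's own process group, so a "0"
	 * entry must never be handed to kill().
	 */
	if (val <= 0 || val > INT_MAX)
		return -1;

	return (pid_t)val;
}
```

A caller would pass the result to kill() only when it is greater than 0.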
Skip entries with PID 0 by returning SEQ_SKIP.
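The effect of skipping can be modeled in userspace roughly as follows. This is a toy model, not kernel code: MODEL_SEQ_SKIP, model_show() and model_render() are made-up names, and the loop only approximates how a record is dropped when the show callback asks for it.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's SEQ_SKIP; name and value are local to this model. */
#define MODEL_SEQ_SKIP 1

/* Model of a show() callback: format one record, or ask the caller to drop it. */
int model_show(char *buf, size_t len, int pid)
{
	if (!pid)
		return MODEL_SEQ_SKIP;
	snprintf(buf, len, "%d\n", pid);
	return 0;
}

/*
 * Model of the iteration loop: append each record to out unless the
 * callback asked for it to be skipped. Returns the number of records
 * emitted.
 */
int model_render(const int *pids, size_t n, char *out, size_t outlen)
{
	char rec[32];
	size_t used = 0;
	int emitted = 0;

	if (outlen == 0)
		return 0;
	out[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		size_t rlen;

		if (model_show(rec, sizeof(rec), pids[i]) == MODEL_SEQ_SKIP)
			continue;	/* record dropped, iteration goes on */

		rlen = strlen(rec);
		if (used + rlen >= outlen)
			break;		/* output buffer full */
		memcpy(out + used, rec, rlen + 1);
		used += rlen;
		emitted++;
	}
	return emitted;
}
```

With input PIDs 101, 0 and 202, only the two non-zero entries reach the output buffer.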

Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup/cgroup.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5287,6 +5287,17 @@ static void *cgroup_procs_start(struct s
 
 static int cgroup_procs_show(struct seq_file *s, void *v)
 {
+	pid_t pid = task_pid_vnr(v);
+
+	/*
+	 * css_task_iter_next() could have visited a task which has already lost
+	 * its PID but is not dead yet or the task could have been unhashed
+	 * since css_task_iter_next(). In such cases, $pid would be 0 here.
+	 * Don't confuse userspace with it.
+	 */
+	if (unlikely(!pid))
+		return SEQ_SKIP;
+
 	seq_printf(s, "%d\n", task_pid_vnr(v));
 	return 0;
 }