css_task_iter_next() pins and returns a task, but the task can do whatever between that and cgroup_procs_show() being called, including dying and losing its PID. When that happens, task_pid_vnr() returns 0.
d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out") makes this more likely as tasks now stay iterable with css_task_iter_next() until the last schedule is complete, which can be after the task has lost its PID.
Showing "0" in cgroup.procs or cgroup.threads is confusing and can lead to surprising outcomes. For example, if a user tries to kill PID 0, it kills all processes in the current process group.
Skip entries with PID 0 by returning SEQ_SKIP.
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup/cgroup.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5287,6 +5287,17 @@ static void *cgroup_procs_start(struct s
 
 static int cgroup_procs_show(struct seq_file *s, void *v)
 {
+	pid_t pid = task_pid_vnr(v);
+
+	/*
+	 * css_task_iter_next() could have visited a task which has already lost
+	 * its PID but is not dead yet or the task could have been unhashed
+	 * since css_task_iter_next(). In such cases, $pid would be 0 here.
+	 * Don't confuse userspace with it.
+	 */
+	if (unlikely(!pid))
+		return SEQ_SKIP;
+
 	seq_printf(s, "%d\n", task_pid_vnr(v));
 	return 0;
 }
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: The upstream commit ID must be specified with a separate line above the commit text.
Subject: [PATCH cgroup/for-6.18-fixes] cgroup: Skip showing PID 0 in cgroup.procs and cgroup.threads
Link: https://lore.kernel.org/stable/2016aece61b4da7ad86c6eca2dbcfd16%40kernel.org
Please ignore this mail if the patch is not relevant for upstream.
Hello.
On Thu, Nov 06, 2025 at 12:07:45PM -1000, Tejun Heo <tj@kernel.org> wrote:
> css_task_iter_next() pins and returns a task, but the task can do whatever between that and cgroup_procs_show() being called, including dying and losing its PID. When that happens, task_pid_vnr() returns 0.
task_pid_vnr() would also return 0 when the process is not from the reader's pidns (IMO more common than the transitional effect).
> Showing "0" in cgroup.procs or cgroup.threads is confusing and can lead to surprising outcomes. For example, if a user tries to kill PID 0, it kills all processes in the current process group.
It's still info about present processes.
> Skip entries with PID 0 by returning SEQ_SKIP.
It's likely OK to skip these exiting tasks, but with the external pidns tasks in mind, reading cgroup.procs may now give a false impression of an empty cgroup.
Where does the 0 for the exiting tasks come from? (Could it be distinguished from a foreign pidns?)
Thanks, Michal
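The kill-PID-0 concern quoted above can also be guarded against on the reading side. The following is a minimal userspace sketch, not part of the patch: parse_signalable_pid() is a hypothetical helper that accepts a line read from cgroup.procs or cgroup.threads only when it holds a strictly positive PID, so a "0" entry is never handed to kill().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an assumption for illustration, not kernel or libc
 * API): parse one line read from cgroup.procs or cgroup.threads.
 * Returns true and stores the PID only for a strictly positive number.
 * "0" is rejected because kill(0, sig) signals the caller's whole
 * process group.
 */
static bool parse_signalable_pid(const char *line, pid_t *out)
{
	char *end;
	long val;

	assert(line != NULL && out != NULL);

	errno = 0;
	val = strtol(line, &end, 10);
	if (errno || end == line)
		return false;		/* overflow or no digits at all */
	if (*end != '\n' && *end != '\0')
		return false;		/* trailing garbage after the number */
	if (val <= 0 || val > INT_MAX)
		return false;		/* 0 and negatives are never valid targets */

	*out = (pid_t)val;
	return true;
}
```

A caller would invoke kill() only when the helper returns true. Whether "0" means an exiting task or a task in a foreign pidns does not matter to this guard; in both cases there is no PID the reader can safely signal.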
Hello,
On Fri, Nov 07, 2025 at 10:57:54AM +0100, Michal Koutný wrote:
> On Thu, Nov 06, 2025 at 12:07:45PM -1000, Tejun Heo <tj@kernel.org> wrote:
> > css_task_iter_next() pins and returns a task, but the task can do whatever between that and cgroup_procs_show() being called, including dying and losing its PID. When that happens, task_pid_vnr() returns 0.
> task_pid_vnr() would also return 0 when the process is not from the reader's pidns (IMO more common than the transitional effect).
Hmm... haven't thought about that.
> > Showing "0" in cgroup.procs or cgroup.threads is confusing and can lead to surprising outcomes. For example, if a user tries to kill PID 0, it kills all processes in the current process group.
> It's still info about present processes.
> > Skip entries with PID 0 by returning SEQ_SKIP.
> It's likely OK to skip these exiting tasks, but with the external pidns tasks in mind, reading cgroup.procs may now give a false impression of an empty cgroup.
> Where does the 0 for the exiting tasks come from? (Could it be distinguished from a foreign pidns?)
Yeah, I think it can be distinguished. We just need to check whether the task has a pid attached at all after getting a 0 return from task_pid_vnr().
Thanks.
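The distinction described in the reply above can be sketched as a small userspace model. This is only an illustration of the decision logic under stated assumptions, not the kernel change itself: struct model_task, model_pid_vnr(), model_procs_show() and the show_action values are all hypothetical names, and the model assumes that a foreign-pidns task still has a PID attached while an exiting task does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_task {
	bool pid_attached;	/* false once the task has lost its PID */
	int pid_in_reader_ns;	/* 0 if not visible in the reader's pidns */
};

enum show_action { SHOW_PID, SHOW_ZERO, SKIP_ENTRY };

/* Stand-in for task_pid_vnr(): 0 for a detached PID or a foreign pidns. */
static int model_pid_vnr(const struct model_task *t)
{
	assert(t != NULL);
	return t->pid_attached ? t->pid_in_reader_ns : 0;
}

/* Decide what a cgroup.procs-style show callback would do for one task. */
static enum show_action model_procs_show(const struct model_task *t)
{
	if (model_pid_vnr(t) != 0)
		return SHOW_PID;

	/* 0 with a PID still attached: a live task in a foreign pidns. */
	if (t->pid_attached)
		return SHOW_ZERO;

	/* 0 with no PID attached: an exiting task, so hide the entry. */
	return SKIP_ENTRY;
}
```

With this split, a cgroup holding only foreign-pidns tasks would still list "0" entries and not look empty, while exiting tasks without a PID would be skipped.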