On Tue, Aug 20, 2024 at 12:34:14PM GMT, Eric Biggers wrote:
On Mon, Aug 19, 2024 at 10:41:15AM +0200, Christian Brauner wrote:
On Sat, Aug 17, 2024 at 08:58:18PM GMT, Eric Biggers wrote:
Hi Christian,
On Wed, Jul 31, 2024 at 12:01:12PM +0200, Christian Brauner wrote:
It's currently possible to create pidfds for kthreads, but it is unclear what that is supposed to mean. Until we have use-cases for it and have figured out what behavior we want, block the creation of pidfds for kthreads.
Fixes: 32fcb426ec00 ("pid: add pidfd_open()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
kernel/fork.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
Unfortunately this commit broke systemd-shutdown's ability to kill processes, which makes some filesystems no longer get unmounted at shutdown.
It looks like systemd-shutdown relies on being able to create a pidfd for any process listed in /proc (even a kthread), and if it gets EINVAL it treats it as a fatal error and stops looking for more processes...
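The failure mode described here can be illustrated with a small sketch. This is hypothetical Python, not systemd's code: `open_pidfds` and its `opener` parameter are names invented for illustration, and `os.pidfd_open` (Python 3.9+, Linux only) stands in for the raw system call. The sketch assumes that EINVAL and ESRCH should cause a PID to be skipped rather than abort the whole scan.

```python
import errno
import os


def open_pidfds(pids, opener=None):
    """Try to open a pidfd for each PID; skip PIDs the kernel refuses.

    Hypothetical helper for illustration only. `opener` defaults to
    os.pidfd_open and is injectable so the loop can be exercised
    without touching real processes.
    """
    if opener is None:
        opener = os.pidfd_open  # looked up lazily: only exists on Linux
    opened, skipped = {}, []
    for pid in pids:
        try:
            opened[pid] = opener(pid)
        except OSError as exc:
            # EINVAL: the kernel refused to create a pidfd for this PID
            #         (for example a kthread, with the patch applied).
            # ESRCH:  the process exited between listing and opening.
            if exc.errno in (errno.EINVAL, errno.ESRCH):
                skipped.append(pid)
                continue  # keep scanning instead of aborting
            raise  # anything else is still treated as a real error
    return opened, skipped
```

With a loop of this shape, one refused PID no longer prevents the remaining processes from being found.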
Thanks for the report! I talked to Daan De Meyer who made that change and he said that this must be a systemd version that hasn't gotten his fixes yet. In any case, if this causes a regression then I'll revert it right now. See the appended revert.
Thanks for queueing up a revert.
This was on systemd 256.4 which was released less than a month ago.
I'm not sure what systemd fix you are talking about. Looking at killall() in src/shared/killall.c on the latest "main" branch of systemd, it calls proc_dir_read_pidref() => pidref_set_pid() => pidfd_open(), and EINVAL gets passed back up to killall() and treated as a fatal error. ignore_proc() skips kernel threads but is executed too late. I didn't test it, so I could be wrong, but based on the code it does not appear to be fixed.
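The ordering problem, where the kernel-thread check runs only after the pidfd has been opened, could be avoided by filtering first. The following is a minimal sketch under stated assumptions, not the systemd implementation: it assumes a kernel thread can be recognised by the PF_KTHREAD bit (0x00200000) in the `flags` field (field 9) of /proc/<pid>/stat, and the helper names `stat_flags`, `is_kernel_thread`, and `userspace_pids` are invented for illustration.

```python
import os

PF_KTHREAD = 0x00200000  # kernel per-task flag: "I am a kernel thread"


def stat_flags(stat_line):
    """Return the `flags` field (field 9) of a /proc/<pid>/stat line."""
    # comm (field 2) is parenthesised and may itself contain spaces or
    # ')' characters, so split after the LAST ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 9 (flags) is rest[6].
    return int(rest[6])


def is_kernel_thread(stat_line):
    """True if the stat line belongs to a kernel thread."""
    return bool(stat_flags(stat_line) & PF_KTHREAD)


def userspace_pids(proc_root="/proc"):
    """Yield PIDs under proc_root that are not kernel threads.

    Hypothetical helper: the check happens BEFORE any pidfd is opened,
    so a kernel that refuses pidfds for kthreads is never asked.
    """
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                line = f.read()
        except (FileNotFoundError, ProcessLookupError):
            continue  # process exited while we were scanning
        if not is_kernel_thread(line):
            yield int(name)
```

A caller would then open pidfds only for the PIDs this generator yields, which keeps the EINVAL case out of the error path entirely.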
Yeah, I think you're right. What they fixed is ead48ec35c86 ("cgroup-util: Don't try to open pidfd for kernel threads") when reading pids from cgroup.procs. Daan is currently prepping a fix for reading pids from /proc as well.