On Mon, Jul 24, 2023 at 09:51:05AM -0700, Linus Torvalds wrote:
On Mon, 24 Jul 2023 at 09:36, Linus Torvalds torvalds@linux-foundation.org wrote:
There are magic rules with "total_refs == inflight_refs", and that total_refs thing is very much the file count, ie
total_refs = file_count(u->sk.sk_socket->file);
where we had some nasty bugs with files coming back to life.
Ok, I don't think this is an issue here. It really is that "only in-flight refs remaining" that is a special case, and even pidfd_getfd() shouldn't be able to change that.
But the magic code is all in fget_task(), and those need to be checked.
You can see how proc does things properly: it does do "fget_task()", but then it only uses it to copy the path part, and just does fput() afterwards.
The bpf code does something like that too, and seems ok (ie it gets the file in order to copy data from it, not to install it).
Aside of fget_task() use, it has this: rcu_read_lock(); for (;; curr_fd++) { struct file *f; f = task_lookup_next_fd_rcu(curr_task, &curr_fd); if (!f) break; if (!get_file_rcu(f)) continue;
/* set info->fd */ info->fd = curr_fd; rcu_read_unlock(); return f; }
curr_task is not cached current here - it can be an arbitrary thread. And what we do to the file reference we get here is
ctx.meta = &meta; ctx.task = info->task; ctx.fd = info->fd; ctx.file = file; return bpf_iter_run_prog(prog, &ctx);
I think it can't be used to shove it into any descriptor table, but then there's forming an SCM_RIGHTS datagram and sending it, etc. - it might be worth looking into.