On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
* Christian Brauner:
+/**
+ * __close_range() - Close all file descriptors in a given range.
+ *
+ * @fd:     starting file descriptor to close
+ * @max_fd: last file descriptor to close
+ *
+ * This closes a range of file descriptors. All file descriptors
+ * from @fd up to and including @max_fd are closed.
+ */
+int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
+{
+	unsigned int cur_max;
+
+	if (fd > max_fd)
+		return -EINVAL;
+
+	rcu_read_lock();
+	cur_max = files_fdtable(files)->max_fds;
+	rcu_read_unlock();
+
+	/* cap to last valid index into fdtable */
+	if (max_fd >= cur_max)
+		max_fd = cur_max - 1;
+
+	while (fd <= max_fd)
+		__close_fd(files, fd++);
+
+	return 0;
+}
This seems rather drastic. How long does this block in kernel mode? Maybe it's okay as long as the maximum possible value for cur_max stays around 4 million or so.
That's probably a valid concern when you reach very high numbers, though I wonder how relevant this is in practice. Also, you would only be blocking yourself I imagine, i.e. you can't DoS another task with this unless you're multi-threaded.
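To put a rough number on the concern, here is a small userspace model of the clamping logic in the quoted patch. It is only a sketch: close_range_attempts() is a made-up name, it closes nothing, and it assumes cur_max is at least 1 so that cur_max - 1 cannot wrap.

```c
#include <errno.h>
#include <limits.h>

/* Userspace model (not kernel code) of the clamping in __close_range().
 * Stores in *attempts how many __close_fd() calls the loop would make.
 * Assumes cur_max >= 1, so that cur_max - 1 cannot wrap around. */
static int close_range_attempts(unsigned fd, unsigned max_fd,
				unsigned cur_max, unsigned long long *attempts)
{
	*attempts = 0;

	if (fd > max_fd)
		return -EINVAL;

	/* cap to last valid index, as the patch does */
	if (max_fd >= cur_max)
		max_fd = cur_max - 1;

	/* one __close_fd() call per index, open or not */
	if (fd <= max_fd)
		*attempts = (unsigned long long)max_fd - fd + 1;

	return 0;
}
```

With cur_max taken as 4194304 (an assumed stand-in for "around 4 million"), a call covering 3 to UINT_MAX would make 4194301 __close_fd() calls, whether or not those descriptors are open.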
Solaris has an fdwalk function:
https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html
So a different way to implement this would expose a nextfd system call
Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?
Technically, one could also do:
fd_range(unsigned fd, unsigned end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);
This syscall could also reasonably be extended.
to userspace, so that we can use that to implement both fdwalk and closefrom. But maybe fdwalk is just too obscure, given the existence of /proc.
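For illustration, this is roughly how closefrom could sit on top of a nextfd primitive. No such syscall exists here, so nextfd_emulated() is a hypothetical userspace stand-in that probes with fcntl(F_GETFD); the limit parameter is likewise an assumption, and picking it reliably is exactly the hard part.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for the proposed nextfd(): return the
 * lowest open descriptor in [fd, limit), or -1 if there is none. */
static int nextfd_emulated(int fd, int limit)
{
	for (; fd < limit; fd++) {
		/* F_GETFD fails with EBADF only when fd is not open. */
		if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
			return fd;
	}
	return -1;
}

/* closefrom() expressed on top of nextfd(): close every open
 * descriptor in [lowfd, limit). */
static void closefrom_via_nextfd(int lowfd, int limit)
{
	int fd = lowfd;

	while ((fd = nextfd_emulated(fd, limit)) != -1) {
		close(fd);
		fd++;
	}
}
```

A real nextfd would jump straight to the next open descriptor instead of probing each index, which is what would make the closefrom loop cheap for sparse tables.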
Yeah we probably don't need fdwalk.
I'll happily implement closefrom on top of close_range in glibc (plus fallback for older kernels based on /proc—with an abort in case that doesn't work because the RLIMIT_NOFILE hack is unreliable unfortunately).
Thanks, Florian