From: Linus Torvalds torvalds@linux-foundation.org
commit 797964253d358cf8d705614dda394dbe30120223 upstream.
In commit 20ea1e7d13c1 ("file: always lock position for FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because pidfd_getfd() could get a reference to the file even when it didn't have an elevated file count due to threading of other sharing cases.
But Mateusz Guzik reports that the extra locking is actually measurable, so let's re-introduce the optimization, and only force the locking for directory traversal.
Directories need the lock for correctness reasons, while regular files only need it for "POSIX semantics". Since pidfd_getfd() is about debuggers etc special things that are _way_ outside of POSIX, we can relax the rules for that case.
Reported-by: Mateusz Guzik mjguzik@gmail.com Cc: Christian Brauner brauner@kernel.org Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/file.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
--- a/fs/file.c +++ b/fs/file.c @@ -1036,12 +1036,28 @@ unsigned long __fdget_raw(unsigned int f return __fget_light(fd, 0); }
+/* + * Try to avoid f_pos locking. We only need it if the + * file is marked for FMODE_ATOMIC_POS, and it can be + * accessed multiple ways. + * + * Always do it for directories, because pidfd_getfd() + * can make a file accessible even if it otherwise would + * not be, and for directories this is a correctness + * issue, not a "POSIX requirement". + */ +static inline bool file_needs_f_pos_lock(struct file *file) +{ + return (file->f_mode & FMODE_ATOMIC_POS) && + (file_count(file) > 1 || S_ISDIR(file_inode(file)->i_mode)); +} + unsigned long __fdget_pos(unsigned int fd) { unsigned long v = __fdget(fd); struct file *file = (struct file *)(v & ~3);
- if (file && (file->f_mode & FMODE_ATOMIC_POS)) { + if (file && file_needs_f_pos_lock(file)) { v |= FDPUT_POS_UNLOCK; mutex_lock(&file->f_pos_lock); }