Antgroup uses 5.10.y in its production environment. We found that several patches we need are missing from the 5.10.y tree, so we backported them to 5.10.y. They are also backported to 5.15.y and 6.1.y to prevent regressions.
Jiachen Zhang (1):
  fuse: always revalidate rename target dentry

Miklos Szeredi (2):
  fuse: fix attr version comparison in fuse_read_update_size()
  fuse: fix deadlock between atomic O_TRUNC and page invalidation

 fs/fuse/dir.c  |  9 +++++++--
 fs/fuse/file.c | 31 ++++++++++++++++++-------------
 2 files changed, 25 insertions(+), 15 deletions(-)
From: Miklos Szeredi <mszeredi@redhat.com>
commit 484ce65715b06aead8c4901f01ca32c5a240bc71 upstream.
[backport for 5.15.y]
A READ request returning a short count is taken as indication of EOF, and the cached file size is modified accordingly.
Fix the attribute version checking to allow for changes to fc->attr_version on other inodes.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Yang Bo <yb203166@antfin.com>
---
 fs/fuse/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 2b19d281351e..ab994ecdaf38 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -793,7 +793,7 @@ static void fuse_read_update_size(struct inode *inode, loff_t size,
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	spin_lock(&fi->lock);
-	if (attr_ver == fi->attr_version && size < inode->i_size &&
+	if (attr_ver >= fi->attr_version && size < inode->i_size &&
 	    !test_bit(FUSE_I_SIZE_UNSTABLE, &fi->state)) {
 		fi->attr_version = atomic64_inc_return(&fc->attr_version);
 		i_size_write(inode, size);
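
To illustrate why the comparison matters, here is a minimal userspace sketch of the version check. It assumes that attr_ver is a sample of the connection-wide counter taken before the READ is sent, and that each inode is stamped with the counter value at its last attribute change. All names (model_conn, model_inode, model_touch, model_read_update_size) are hypothetical stand-ins, not kernel APIs, and the sketch omits locking and the FUSE_I_SIZE_UNSTABLE test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_conn {
	uint64_t attr_version;	/* plays the role of fc->attr_version */
};

struct model_inode {
	uint64_t attr_version;	/* plays the role of fi->attr_version */
	int64_t size;		/* plays the role of inode->i_size */
};

/* Bump the connection-wide counter and stamp the inode with the new value. */
static void model_touch(struct model_conn *fc, struct model_inode *fi)
{
	fi->attr_version = ++fc->attr_version;
}

/*
 * Model of the short-read path. attr_ver is the connection-wide counter
 * sampled before the READ was sent. strict selects the old "==" test,
 * otherwise the patched ">=" test is used.
 * Returns true if the cached size was shrunk.
 */
static bool model_read_update_size(struct model_conn *fc,
				   struct model_inode *fi,
				   int64_t size, uint64_t attr_ver,
				   bool strict)
{
	bool unchanged = strict ? attr_ver == fi->attr_version
				: attr_ver >= fi->attr_version;

	if (unchanged && size < fi->size) {
		model_touch(fc, fi);
		fi->size = size;
		return true;
	}
	return false;
}
```

In this model, touching a second inode between the last change to the first inode and the sample makes the sample larger than the first inode's version. The strict test then skips the size update even though that inode's attributes did not change, while the ">=" test still applies it.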
From: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
commit ccc031e26afe60d2a5a3d93dabd9c978210825fb upstream.
[backport for 5.15.y]
The previous commit df8629af2934 ("fuse: always revalidate if exclusive create") ensures that the dentries are revalidated on O_EXCL creates. This commit complements it by also performing revalidation for rename target dentries. Otherwise, a rename target file that only exists in kernel dentry cache but not in the filesystem will result in EEXIST if RENAME_NOREPLACE flag is used.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Yang Bo <yb203166@antfin.com>
---
 fs/fuse/dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 80a2181b402b..075266140fac 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -205,7 +205,7 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 	if (inode && fuse_is_bad(inode))
 		goto invalid;
 	else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
-		 (flags & (LOOKUP_EXCL | LOOKUP_REVAL))) {
+		 (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
 		struct fuse_entry_out outarg;
 		FUSE_ARGS(args);
 		struct fuse_forget_link *forget;
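
The changed condition can be modelled as a small predicate. This is only a sketch: the SK_* flag values are chosen for illustration and are not taken from include/linux/namei.h, and sk_needs_lookup is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>

/*
 * Illustrative flag bits for this sketch only. The real LOOKUP_*
 * definitions live in include/linux/namei.h and may differ.
 */
#define SK_LOOKUP_REVAL		0x0020u
#define SK_LOOKUP_EXCL		0x0400u
#define SK_LOOKUP_RENAME_TARGET	0x0800u

/*
 * Returns true when a cached dentry must be looked up on the server
 * again instead of being trusted. patched selects the condition with
 * the rename-target flag included.
 */
static bool sk_needs_lookup(bool entry_expired, unsigned int flags,
			    bool patched)
{
	unsigned int force = SK_LOOKUP_EXCL | SK_LOOKUP_REVAL;

	if (patched)
		force |= SK_LOOKUP_RENAME_TARGET;

	return entry_expired || (flags & force) != 0;
}
```

With an unexpired cache entry and only the rename-target flag set, the unpatched predicate trusts the cached dentry. That is the situation in which a stale entry can make a RENAME_NOREPLACE rename fail with EEXIST.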
From: Miklos Szeredi <mszeredi@redhat.com>
commit 2fdbb8dd01556e1501132b5ad3826e8f71e24a8b upstream.
[backport for 5.15.y]
fuse_finish_open() will be called with FUSE_NOWRITE set in case of atomic O_TRUNC open(), so commit 76224355db75 ("fuse: truncate pagecache on atomic_o_trunc") replaced invalidate_inode_pages2() by truncate_pagecache() in such a case to avoid the A-A deadlock. However, we found another A-B-B-A deadlock related to the case above, which causes the xfstests generic/464 test case to hang in our virtio-fs test environment.
For example, consider two processes concurrently opening the same file, one with O_TRUNC and the other without. The deadlock occurs as follows: if open(O_TRUNC) has already called fuse_set_nowrite() (acquired A) and is trying to lock a page (acquiring B), open() may already hold the page lock (acquired B) and be waiting on page writeback (acquiring A). This leads to a deadlock.
open(O_TRUNC)
----------------------------------------------------------------
fuse_open_common
  inode_lock            [C acquire]
  fuse_set_nowrite      [A acquire]

  fuse_finish_open
    truncate_pagecache
      lock_page         [B acquire]
      truncate_inode_page
      unlock_page       [B release]

  fuse_release_nowrite  [A release]
  inode_unlock          [C release]
----------------------------------------------------------------

open()
----------------------------------------------------------------
fuse_open_common
  fuse_finish_open
    invalidate_inode_pages2
      lock_page         [B acquire]
        fuse_launder_page
          fuse_wait_on_page_writeback [A acquire & release]
      unlock_page       [B release]
----------------------------------------------------------------
Besides this case, all calls of invalidate_inode_pages2() and invalidate_inode_pages2_range() in fuse code also can deadlock with open(O_TRUNC).
Fix by moving the truncate_pagecache() call outside the nowrite protected region. The nowrite protection is only for delayed writeback (writeback_cache) case, where inode lock does not protect against truncation racing with writes on the server. Write syscalls racing with page cache truncation still get the inode lock protection.
This patch also changes the order of filemap_invalidate_lock() vs. fuse_set_nowrite() in fuse_open_common(). This new order matches the order found in fuse_file_fallocate() and fuse_do_setattr().
Reported-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Tested-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Fixes: e4648309b85a ("fuse: truncate pending writes on O_TRUNC")
Cc: stable@vger.kernel.org
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Yang Bo <yb203166@antfin.com>
---
 fs/fuse/dir.c  |  7 ++++++-
 fs/fuse/file.c | 29 +++++++++++++++++------------
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 075266140fac..1abbdd78389a 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -476,6 +476,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct fuse_entry_out outentry;
 	struct fuse_inode *fi;
 	struct fuse_file *ff;
+	bool trunc = flags & O_TRUNC;
 
 	/* Userspace expects S_IFREG in create mode */
 	BUG_ON((mode & S_IFMT) != S_IFREG);
@@ -500,7 +501,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	inarg.mode = mode;
 	inarg.umask = current_umask();
 
-	if (fm->fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
+	if (fm->fc->handle_killpriv_v2 && trunc &&
 	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
 		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
 	}
@@ -549,6 +550,10 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	} else {
 		file->private_data = ff;
 		fuse_finish_open(inode, file);
+		if (fm->fc->atomic_o_trunc && trunc)
+			truncate_pagecache(inode, 0);
+		else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
+			invalidate_inode_pages2(inode->i_mapping);
 	}
 	return err;
 
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index ab994ecdaf38..2c4cac6104c9 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -210,12 +210,9 @@ void fuse_finish_open(struct inode *inode, struct file *file)
 		fi->attr_version = atomic64_inc_return(&fc->attr_version);
 		i_size_write(inode, 0);
 		spin_unlock(&fi->lock);
-		truncate_pagecache(inode, 0);
 		fuse_invalidate_attr(inode);
 		if (fc->writeback_cache)
 			file_update_time(file);
-	} else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
-		invalidate_inode_pages2(inode->i_mapping);
 	}
 
 	if ((file->f_mode & FMODE_WRITE) && fc->writeback_cache)
@@ -240,30 +237,38 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
 	if (err)
 		return err;
 
-	if (is_wb_truncate || dax_truncate) {
+	if (is_wb_truncate || dax_truncate)
 		inode_lock(inode);
-		fuse_set_nowrite(inode);
-	}
 
 	if (dax_truncate) {
 		filemap_invalidate_lock(inode->i_mapping);
 		err = fuse_dax_break_layouts(inode, 0, 0);
 		if (err)
-			goto out;
+			goto out_inode_unlock;
 	}
 
+	if (is_wb_truncate || dax_truncate)
+		fuse_set_nowrite(inode);
+
 	err = fuse_do_open(fm, get_node_id(inode), file, isdir);
 	if (!err)
 		fuse_finish_open(inode, file);
 
-out:
+	if (is_wb_truncate || dax_truncate)
+		fuse_release_nowrite(inode);
+	if (!err) {
+		struct fuse_file *ff = file->private_data;
+
+		if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC))
+			truncate_pagecache(inode, 0);
+		else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
+			invalidate_inode_pages2(inode->i_mapping);
+	}
 	if (dax_truncate)
 		filemap_invalidate_unlock(inode->i_mapping);
-
-	if (is_wb_truncate | dax_truncate) {
-		fuse_release_nowrite(inode);
+out_inode_unlock:
+	if (is_wb_truncate || dax_truncate)
 		inode_unlock(inode);
-	}
 
 	return err;
 }
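
The A-B-B-A ordering from the commit message can be checked with a small userspace lock-order recorder, loosely in the spirit of lockdep. This is a sketch under simplifying assumptions: A (nowrite), B (page lock) and C (inode lock) are treated as plain locks, although A is really a wait on writeback, and the DAX-related filemap_invalidate_lock() is left out. All sk_* names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Labels follow the commit message: A = nowrite, B = page lock, C = inode lock. */
enum sk_lock { SK_A, SK_B, SK_C, SK_NLOCKS };

struct sk_order {
	bool held[SK_NLOCKS];			/* held by the path being replayed */
	bool before[SK_NLOCKS][SK_NLOCKS];	/* before[x][y]: y taken while x held */
};

/* Start replaying a new code path: nothing is held yet. */
static void sk_begin_path(struct sk_order *o)
{
	memset(o->held, 0, sizeof(o->held));
}

/* Record an ordering edge from every currently held lock to l. */
static void sk_acquire(struct sk_order *o, enum sk_lock l)
{
	for (int x = 0; x < SK_NLOCKS; x++)
		if (o->held[x])
			o->before[x][l] = true;
	o->held[l] = true;
}

static void sk_release(struct sk_order *o, enum sk_lock l)
{
	o->held[l] = false;
}

/* True if some pair of locks has been taken in both orders. */
static bool sk_has_inversion(const struct sk_order *o)
{
	for (int x = 0; x < SK_NLOCKS; x++)
		for (int y = x + 1; y < SK_NLOCKS; y++)
			if (o->before[x][y] && o->before[y][x])
				return true;
	return false;
}

/* open(O_TRUNC); trunc_inside_nowrite selects the pre-fix ordering. */
static void sk_open_trunc(struct sk_order *o, bool trunc_inside_nowrite)
{
	sk_begin_path(o);
	sk_acquire(o, SK_C);		/* inode_lock */
	sk_acquire(o, SK_A);		/* fuse_set_nowrite */
	if (trunc_inside_nowrite) {
		sk_acquire(o, SK_B);	/* truncate_pagecache -> lock_page */
		sk_release(o, SK_B);
	}
	sk_release(o, SK_A);		/* fuse_release_nowrite */
	if (!trunc_inside_nowrite) {
		sk_acquire(o, SK_B);	/* truncate_pagecache after nowrite */
		sk_release(o, SK_B);
	}
	sk_release(o, SK_C);		/* inode_unlock */
}

/* Plain open(): invalidate_inode_pages2 waits on writeback under the page lock. */
static void sk_open_plain(struct sk_order *o)
{
	sk_begin_path(o);
	sk_acquire(o, SK_B);		/* lock_page */
	sk_acquire(o, SK_A);		/* fuse_wait_on_page_writeback */
	sk_release(o, SK_A);
	sk_release(o, SK_B);		/* unlock_page */
}
```

Replaying sk_open_trunc() with trunc_inside_nowrite set, followed by sk_open_plain(), records both A-before-B and B-before-A, so sk_has_inversion() reports a conflict. With the truncation moved after the nowrite region, only B-before-A remains and no inversion is recorded.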
linux-stable-mirror@lists.linaro.org