Re: tmpfs hang after v6.12-rc6

15 Nov 2024

On Sat, Nov 16, 2024 at 01:58:17AM +0900, Jeongjun Park wrote:
...
Chuck Lever chuck.lever@oracle.com wrote:
...
On Sat, Nov 16, 2024 at 01:33:19AM +0900, Jeongjun Park wrote:
...
2024년 11월 16일 (토) 오전 1:25, Chuck Lever chuck.lever@oracle.com님이 작성:
...
On Fri, Nov 15, 2024 at 11:04:56AM -0500, Chuck Lever wrote:
...
I've found that NFS access to an exported tmpfs file system hangs
indefinitely when the client first performs a GETATTR. The hanging
nfsd thread is waiting for the inode lock in shmem_getattr():
task:nfsd            state:D stack:0     pid:1775  tgid:1775  ppid:2      flags:0x00004000
Call Trace:
 <TASK>
 __schedule+0x770/0x7b0
 schedule+0x33/0x50
 schedule_preempt_disabled+0x19/0x30
 rwsem_down_read_slowpath+0x206/0x230
 down_read+0x3f/0x60
 shmem_getattr+0x84/0xf0
 vfs_getattr_nosec+0x9e/0xc0
 vfs_getattr+0x49/0x50
 fh_getattr+0x43/0x50 [nfsd]
 fh_fill_pre_attrs+0x4e/0xd0 [nfsd]
 nfsd4_open+0x51f/0x910 [nfsd]
 nfsd4_proc_compound+0x492/0x5d0 [nfsd]
 nfsd_dispatch+0x117/0x1f0 [nfsd]
 svc_process_common+0x3b2/0x5e0 [sunrpc]
 ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
 svc_process+0xcf/0x130 [sunrpc]
 svc_recv+0x64e/0x750 [sunrpc]
 ? __wake_up_bit+0x4b/0x60
 ? __pfx_nfsd+0x10/0x10 [nfsd]
 nfsd+0xc6/0xf0 [nfsd]
 kthread+0xed/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2e/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
I bisected the problem to:
d949d1d14fa281ace388b1de978e8f2cd52875cf is the first bad commit
commit d949d1d14fa281ace388b1de978e8f2cd52875cf
Author:     Jeongjun Park aha310510@gmail.com
AuthorDate: Mon Sep 9 21:35:58 2024 +0900
Commit:     Andrew Morton akpm@linux-foundation.org
CommitDate: Mon Oct 28 21:40:39 2024 -0700
mm: shmem: fix data-race in shmem_getattr()


...
Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@gmail.com
Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Reported-by: syzbot <syzkaller@googlegroup.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


which first appeared in v6.12-rc6, and adds the line that is waiting
on the inode lock when my NFS server hangs.
I haven't yet found the process that is holding the inode lock for
this inode.
It is likely that the caller (nfsd4_open()-> fh_fill_pre_attrs()) is
already holding the inode semaphore in this case.
Thanks for letting me know!
It seems that the previous patch I wrote was wrong in how to prevent data-race.
It seems that the problem occurs in nfsd because nfsd4_create_file() already
holds the inode_lock.
After further analysis, I found that this data-race mainly occurs when
vfs_statx_path does not acquire the inode_lock, and in other filesystems,
it is confirmed that inode_lock is acquired in many cases, so I will send a
new patch that fixes this problem right away.
Thanks for your quick response!
My brief sample of file system ->getattr methods shows that these
functions do not grab the inode semaphore at all when calling
generic_fillattr(). Likely they expect the method's caller to take
it.
I strongly prefer to see this commit reverted for v6.12-rc first,
and then the new fix should be merged via a normal merge window to
permit a lengthy period of testing.
Hmm... Of course, revert this patch is not a bad idea, but I think that
patching it like below can effectively prevent data-race without causing
recursive locking:

mm/shmem.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e87f5d6799a7..d061f8b34d49 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1153,6 +1153,12 @@ static int shmem_getattr(struct mnt_idmap *idmap,
 {
    struct inode *inode = path->dentry->d_inode;
    struct shmem_inode_info *info = SHMEM_I(inode);

bool inode_locked = NULL;


if (!inode_is_locked(inode)) {

  inode_lock_shared(inode);


  inode_locked = true;


}
if (info->alloced - info->swapped != inode->i_mapping->nrpages)
    shmem_recalc_inode(inode, 0, 0);


@@ -1166,9 +1172,7 @@ static int shmem_getattr(struct mnt_idmap *idmap,
    stat->attributes_mask |= (STATX_ATTR_APPEND |
            STATX_ATTR_IMMUTABLE |
            STATX_ATTR_NODUMP);

inode_lock_shared(inode);
generic_fillattr(idmap, request_mask, inode, stat);

inode_unlock_shared(inode);
if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0))
    stat->blksize = HPAGE_PMD_SIZE;


@@ -1179,6 +1183,9 @@ static int shmem_getattr(struct mnt_idmap *idmap,
        stat->btime.tv_nsec = info->i_crtime.tv_nsec;
    }

if (inode_locked)
  inode_unlock_shared(inode);


return 0;

}
--
What do you think?
As I said before, I've failed to find any file system getattr method
that explicitly takes the inode's semaphore around a
generic_fillattr() call. My understanding is that these methods
assume that their caller handles appropriate serialization.
Therefore, taking the inode semaphore at all in shmem_getattr()
seems to me to be the wrong approach entirely.
The point of reverting immediately is that any fix can't possibly
get the review and testing it deserves three days before v6.12
becomes final. Also, it's not clear what the rush to fix the
KCSAN splat is; according to the Fixes: tag, this issue has been
present for years without causing overt problems. It doesn't feel
like this change belongs in an -rc in the first place.
Please revert d949d1d14fa2, then let's discuss a proper fix in a
separate thread. Thanks!
...
Regards,
Jeongjun Park
...
...
...
...
Because this commit addresses only a KCSAN splat that has been
present since v4.3, and does not address a reported behavioral
issue, I respectfully request that this commit be reverted
immediately so that it does not appear in v6.12 final.
Troubleshooting and testing should continue until a fix to the KCSAN
issue can be found that does not deadlock NFS exports of tmpfs.
--
Chuck Lever
--
Chuck Lever
--
Chuck Lever
-- 
Chuck Lever


    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: tmpfs hang after v6.12-rc6