Hello,
Lack of proper validation that cached inodes are free during allocation can cause a crash in fs/xfs/xfs_icache.c (refer: CVE-2018-13093). To address this issue, I'm backporting upstream commit [1] to the 4.4 and 4.9 stable trees (a backport of [1] to 4.14 already exists).
Also, commit [1] references another commit [2], which added checks only to xfs_iget_cache_miss(). In commit [1], those checks are moved into a dedicated checker function, and both xfs_iget_cache_miss() and xfs_iget_cache_hit() call it. This reorganisation in commit [1] makes commit [2] redundant in the history of the 4.9 and 4.4 stable trees, so commit [2] is not being backported.
-- Sid
[1]: afca6c5b2595 ("xfs: validate cached inodes are free when allocated")
[2]: ee457001ed6c ("xfs: catch inode allocation state mismatch corruption")
change log:
v2:
- Reword cover letter.
- Fix the wrong patch that was accidentally mailed.
From: Dave Chinner <dchinner@redhat.com>
commit afca6c5b2595fc44383919fba740c194b0b76aff upstream.
A recent fuzzed filesystem image caused random dcache corruption when the reproducer was run. This often showed up as panics in lookup_slow() on a null inode->i_ops pointer when doing pathwalks.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
....
Call Trace:
 lookup_slow+0x44/0x60
 walk_component+0x3dd/0x9f0
 link_path_walk+0x4a7/0x830
 path_lookupat+0xc1/0x470
 filename_lookup+0x129/0x270
 user_path_at_empty+0x36/0x40
 path_listxattr+0x98/0x110
 SyS_listxattr+0x13/0x20
 do_syscall_64+0xf5/0x280
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
but had many different failure modes including deadlocks trying to lock the inode that was just allocated or KASAN reports of use-after-free violations.
The cause of the problem was a corrupt INOBT on a v4 fs where the root inode was marked as free in the inobt record. Hence when we allocated an inode, it chose the root inode to allocate, found it in the cache and re-initialised it.
We recently fixed a similar inode allocation issue, caused by inobt record corruption, in xfs_iget_cache_miss() in commit ee457001ed6c ("xfs: catch inode allocation state mismatch corruption"). This change adds similar checks to the cache-hit path to catch it, and turns the reproducer into a corruption shutdown situation.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>
---
 fs/xfs/xfs_icache.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 50 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index adbc1f5..7efeefb 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -135,6 +135,46 @@ xfs_inode_free(
 }
 
 /*
+ * If we are allocating a new inode, then check what was returned is
+ * actually a free, empty inode. If we are not allocating an inode,
+ * then check we didn't find a free inode.
+ *
+ * Returns:
+ *	0		if the inode free state matches the lookup context
+ *	-ENOENT		if the inode is free and we are not allocating
+ *	-EFSCORRUPTED	if there is any state mismatch at all
+ */
+static int
+xfs_iget_check_free_state(
+	struct xfs_inode	*ip,
+	int			flags)
+{
+	if (flags & XFS_IGET_CREATE) {
+		/* should be a free inode */
+		if (VFS_I(ip)->i_mode != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx not marked free! (mode 0x%x)",
+				ip->i_ino, VFS_I(ip)->i_mode);
+			return -EFSCORRUPTED;
+		}
+
+		if (ip->i_d.di_nblocks != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx has blocks allocated!",
+				ip->i_ino);
+			return -EFSCORRUPTED;
+		}
+		return 0;
+	}
+
+	/* should be an allocated inode */
+	if (VFS_I(ip)->i_mode == 0)
+		return -ENOENT;
+
+	return 0;
+}
+
+/*
  * Check the validity of the inode we just found it the cache
  */
 static int
@@ -183,12 +223,12 @@ xfs_iget_cache_hit(
 	}
 
 	/*
-	 * If lookup is racing with unlink return an error immediately.
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
 	 */
-	if (ip->i_d.di_mode == 0 && !(flags & XFS_IGET_CREATE)) {
-		error = -ENOENT;
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_error;
-	}
 
 	/*
 	 * If IRECLAIMABLE is set, we've torn down the VFS inode already.
@@ -298,10 +338,13 @@ xfs_iget_cache_miss(
 
 	trace_xfs_iget_miss(ip);
 
-	if ((ip->i_d.di_mode == 0) && !(flags & XFS_IGET_CREATE)) {
-		error = -ENOENT;
+	/*
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
+	 */
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_destroy;
-	}
 
 	/*
 	 * Preload the radix tree so we can insert safely under the
From: Dave Chinner <dchinner@redhat.com>
commit afca6c5b2595fc44383919fba740c194b0b76aff upstream.
A recent fuzzed filesystem image caused random dcache corruption when the reproducer was run. This often showed up as panics in lookup_slow() on a null inode->i_ops pointer when doing pathwalks.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
....
Call Trace:
 lookup_slow+0x44/0x60
 walk_component+0x3dd/0x9f0
 link_path_walk+0x4a7/0x830
 path_lookupat+0xc1/0x470
 filename_lookup+0x129/0x270
 user_path_at_empty+0x36/0x40
 path_listxattr+0x98/0x110
 SyS_listxattr+0x13/0x20
 do_syscall_64+0xf5/0x280
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
but had many different failure modes including deadlocks trying to lock the inode that was just allocated or KASAN reports of use-after-free violations.
The cause of the problem was a corrupt INOBT on a v4 fs where the root inode was marked as free in the inobt record. Hence when we allocated an inode, it chose the root inode to allocate, found it in the cache and re-initialised it.
We recently fixed a similar inode allocation issue, caused by inobt record corruption, in xfs_iget_cache_miss() in commit ee457001ed6c ("xfs: catch inode allocation state mismatch corruption"). This change adds similar checks to the cache-hit path to catch it, and turns the reproducer into a corruption shutdown situation.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>
---
 fs/xfs/xfs_icache.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 50 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 86a4911..d668d30 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -308,6 +308,46 @@ xfs_reinit_inode(
 }
 
 /*
+ * If we are allocating a new inode, then check what was returned is
+ * actually a free, empty inode. If we are not allocating an inode,
+ * then check we didn't find a free inode.
+ *
+ * Returns:
+ *	0		if the inode free state matches the lookup context
+ *	-ENOENT		if the inode is free and we are not allocating
+ *	-EFSCORRUPTED	if there is any state mismatch at all
+ */
+static int
+xfs_iget_check_free_state(
+	struct xfs_inode	*ip,
+	int			flags)
+{
+	if (flags & XFS_IGET_CREATE) {
+		/* should be a free inode */
+		if (VFS_I(ip)->i_mode != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx not marked free! (mode 0x%x)",
+				ip->i_ino, VFS_I(ip)->i_mode);
+			return -EFSCORRUPTED;
+		}
+
+		if (ip->i_d.di_nblocks != 0) {
+			xfs_warn(ip->i_mount,
+"Corruption detected! Free inode 0x%llx has blocks allocated!",
+				ip->i_ino);
+			return -EFSCORRUPTED;
+		}
+		return 0;
+	}
+
+	/* should be an allocated inode */
+	if (VFS_I(ip)->i_mode == 0)
+		return -ENOENT;
+
+	return 0;
+}
+
+/*
  * Check the validity of the inode we just found it the cache
  */
 static int
@@ -356,12 +396,12 @@ xfs_iget_cache_hit(
 	}
 
 	/*
-	 * If lookup is racing with unlink return an error immediately.
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
 	 */
-	if (VFS_I(ip)->i_mode == 0 && !(flags & XFS_IGET_CREATE)) {
-		error = -ENOENT;
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_error;
-	}
 
 	/*
 	 * If IRECLAIMABLE is set, we've torn down the VFS inode already.
@@ -471,10 +511,13 @@ xfs_iget_cache_miss(
 
 	trace_xfs_iget_miss(ip);
 
-	if ((VFS_I(ip)->i_mode == 0) && !(flags & XFS_IGET_CREATE)) {
-		error = -ENOENT;
+	/*
+	 * Check the inode free state is valid. This also detects lookup
+	 * racing with unlinks.
+	 */
+	error = xfs_iget_check_free_state(ip, flags);
+	if (error)
 		goto out_destroy;
-	}
 
 	/*
 	 * Preload the radix tree so we can insert safely under the
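The decision table that xfs_iget_check_free_state() implements in the patches above can be tried out in user space. What follows is only a rough model under stated assumptions, not the kernel code: struct model_inode, MODEL_IGET_CREATE and model_check_free_state() are hypothetical stand-ins, the flag value is arbitrary, and the xfs_warn() calls are left out. The one kernel detail relied on is that Linux XFS defines EFSCORRUPTED as EUCLEAN.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for XFS_IGET_CREATE; the real flag value differs. */
#define MODEL_IGET_CREATE	0x1

/* Linux XFS defines EFSCORRUPTED as EUCLEAN; fall back to 117 elsewhere. */
#ifdef EUCLEAN
#define MODEL_EFSCORRUPTED	EUCLEAN
#else
#define MODEL_EFSCORRUPTED	117
#endif

/* Cut-down view of the two fields the check inspects (assumed layout). */
struct model_inode {
	uint16_t	mode;		/* models VFS_I(ip)->i_mode, 0 == free */
	uint64_t	nblocks;	/* models ip->i_d.di_nblocks */
};

/*
 * Returns 0 when the inode's free state matches the lookup context,
 * -ENOENT for a free inode on a plain lookup, and -MODEL_EFSCORRUPTED
 * when an allocation is handed an inode that is not free and empty.
 */
int
model_check_free_state(const struct model_inode *ip, int flags)
{
	if (flags & MODEL_IGET_CREATE) {
		/* allocation: the inode must be free ... */
		if (ip->mode != 0)
			return -MODEL_EFSCORRUPTED;
		/* ... and must not own any blocks */
		if (ip->nblocks != 0)
			return -MODEL_EFSCORRUPTED;
		return 0;
	}

	/* plain lookup: a free inode means we raced with unlink */
	if (ip->mode == 0)
		return -ENOENT;

	return 0;
}
```

In this model, the scenario from the commit message (an in-use root inode offered up for allocation because the inobt marks it free) corresponds to calling model_check_free_state() with a non-zero mode and MODEL_IGET_CREATE set, which yields the corruption error instead of silently re-initialising the inode.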
On Fri, May 15, 2020 at 08:41:07PM +0530, Siddharth Chandrasekaran wrote:
[...]
As the XFS maintainers want to see xfstests pass with any changes made, have you done so for the 4.9 and 4.4 trees with this patch applied?
thanks,
greg k-h
On Fri, May 15, 2020 at 05:22:30PM +0200, Greg KH wrote:
As the XFS maintainers want to see xfstests pass with any changes made, have you done so for the 4.9 and 4.4 trees with this patch applied?
I haven't run them yet. I'll do so and get back with the results shortly.
-- Sid.
On Fri, May 15, 2020 at 09:28:38PM +0530, Siddharth Chandrasekaran wrote:
I haven't run them yet. I'll do so and get back with the results shortly.
Hi Greg,
I am having some issues setting up my xfstests testing environment. On an Ubuntu 20.04 LTS VM, I installed the 4.9.223 kernel with this patch applied. I then cloned the xfstests-dev repository from [1] and set up the test environment as explained in the top-level README file. After this, I did the following:
- Added a new disk (/dev/sdb1) and created 2 partitions (64 GB each).
- Formatted /dev/sdb1 as xfs and dropped a few kernel source tarballs into it.
- Copied local.config.example to local.config and modified it as:
  export TEST_DEV=/dev/sdb1
  export TEST_DIR=/mnt/t0
  export SCRATCH_DEV=/dev/sdb2
  export SCRATCH_MNT=/mnt/scratch
- Executed: sudo ./check -g all
When executing the tests, I observed multiple failures. In addition to the test failures, the testing script froze after executing some test cases (most often test 269) when trying to perform a mount or umount operation on either of the newly added partitions.
So I presumed the patch was buggy and reverted the change to retry the tests. Interestingly, that run also failed and produced similar results. dmesg is filled with xfs errors, the most frequent being:
XFS (dm-0): metadata I/O error: block 0x3 ("xfs_trans_read_buf_map") error 5 numblks 1
Obviously, I must be doing something wrong. I can try to dig deeper and figure it out myself, but I wanted to check with you first in case you can spot something obviously wrong in what I'm doing.
Thanks!
-- Sid.
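As a side note on reading the dmesg line quoted above: the trailing "error 5" looks like an errno value, and 5 is EIO (I/O error) on Linux. A small helper along the following lines can decode such codes; describe_error() is a hypothetical name used only for this sketch and is not part of xfstests or the kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an errno-style code as "<number> (<text>)". Kernel code usually
 * carries errors as negative values, so the sign is dropped first.
 * Returns what snprintf() returns.
 */
int
describe_error(int err, char *buf, size_t len)
{
	int code = err < 0 ? -err : err;

	return snprintf(buf, len, "%d (%s)", code, strerror(code));
}
```

Calling describe_error(5, buf, sizeof(buf)) fills buf with the number followed by the C library's description of EIO.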
On Wed, May 20, 2020 at 02:48:00PM +0530, Siddharth Chandrasekaran wrote:
Hi Greg,
I am having some issues setting up my xfstests testing environment. On an Ubuntu 20.04 LTS VM, I installed the 4.9.223 kernel with this patch applied. I then cloned the xfstests-dev repository from [1] and set up the test environment as explained in the top-level README file. After this, I did the following:
- Added a new disk (/dev/sdb1) and created 2 partitions (64 GB each).
- Formatted /dev/sdb1 as xfs and dropped a few kernel source tarballs into it.
- Copied local.config.example to local.config and modified it as:
  export TEST_DEV=/dev/sdb1
  export TEST_DIR=/mnt/t0
  export SCRATCH_DEV=/dev/sdb2
  export SCRATCH_MNT=/mnt/scratch
- Executed: sudo ./check -g all
When executing the tests, I observed multiple failures. In addition to the test failures, the testing script froze after executing some test cases (most often test 269) when trying to perform a mount or umount operation on either of the newly added partitions.
There are multiple "test 269"s. generic/269? xfs/269?
So I presumed the patch was buggy and reverted the change to retry the tests. Interestingly, that run also failed and produced similar results. dmesg is filled with xfs errors, the most frequent being:
XFS (dm-0): metadata I/O error: block 0x3 ("xfs_trans_read_buf_map") error 5 numblks 1
Obviously, I must be doing something wrong. I can try to dig deeper and figure it out myself, but I wanted to check with you first in case you can spot something obviously wrong in what I'm doing.
Dirty open secret of fstests: it's all A/B testing, where "A" involves running fstests until you've established a baseline of which tests actually pass on the base of your (4.4 and 4.9) kernels, and which tests to exclude (e.g. "test 269") because they crash the kernel.
(Please do ask the xfs list about the ones that crash the kernel.)
You might also run the crashing tests on a recent Linus release on the off chance that the test is one of the ones that have been slowly fixed over the intervening years but are too unwieldy/infrequently-hit of a patchset to have been auto-backported to stable.
(The shutdown tests like xfs/057, generic/388, and generic/475 come to mind here.)
Sorry for the slow response, I'm rather overwhelmed these days...
--D
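The A/B approach described in the reply above boils down to comparing two result sets: a test only counts against the patch if it passed on the unpatched baseline kernel and fails on the patched one. A minimal sketch of that comparison follows, assuming the per-test outcomes have already been collected into arrays. struct test_result, find_result() and find_regressions() are hypothetical names for illustration; fstests itself does not provide them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One test outcome; the name is just a label such as "generic/269". */
struct test_result {
	const char	*name;
	bool		passed;
};

/* Return the entry for name in set, or NULL if that run did not include it. */
const struct test_result *
find_result(const struct test_result *set, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(set[i].name, name) == 0)
			return &set[i];
	return NULL;
}

/*
 * Collect tests that passed on the baseline kernel (A) but fail on the
 * patched kernel (B). Tests that already failed in A, or were excluded
 * from A (for example because they crash the kernel), are ignored.
 * Returns the number of names written to out, at most max_out.
 */
size_t
find_regressions(const struct test_result *base, size_t nbase,
		 const struct test_result *patched, size_t npatched,
		 const char **out, size_t max_out)
{
	size_t found = 0;

	for (size_t i = 0; i < npatched && found < max_out; i++) {
		const struct test_result *a =
			find_result(base, nbase, patched[i].name);

		if (a && a->passed && !patched[i].passed)
			out[found++] = patched[i].name;
	}
	return found;
}
```

The linear lookup keeps the sketch short; it is quadratic in the number of tests, which should be fine for a single fstests run but is a design shortcut rather than a recommendation.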
linux-stable-mirror@lists.linaro.org