Returning to focus on 6.1, here is the 6.1 set from the corresponding 6.6 set:
https://lore.kernel.org/all/20240208232054.15778-1-catherine.hoang@oracle.co...
Two patches are missing from the original set:

[01/21] MAINTAINERS: add Catherine as xfs maintainer for 6.6.y
  6.6.y-only change
[16/21] xfs: fix again select in kconfig XFS_ONLINE_SCRUB_STATS
  XFS_ONLINE_SCRUB_STATS didn't show up till 6.6
The auto group was run on 10 configs and no regressions were seen. This has been ack'd on the xfs-stable mailing list.
Thanks, Leah
Catherine Hoang (1):
  xfs: allow read IO and FICLONE to run concurrently

Cheng Lin (1):
  xfs: introduce protection for drop nlink

Christoph Hellwig (4):
  xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space
  xfs: only remap the written blocks in xfs_reflink_end_cow_extent
  xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags
  xfs: respect the stable writes flag on the RT device

Darrick J. Wong (8):
  xfs: bump max fsgeom struct version
  xfs: hoist freeing of rt data fork extent mappings
  xfs: prevent rt growfs when quota is enabled
  xfs: rt stubs should return negative errnos when rt disabled
  xfs: fix units conversion error in xfs_bmap_del_extent_delay
  xfs: make sure maxlen is still congruent with prod when rounding down
  xfs: clean up dqblk extraction
  xfs: dquot recovery does not validate the recovered dquot

Dave Chinner (1):
  xfs: inode recovery does not validate the recovered inode

Leah Rumancik (1):
  xfs: up(ic_sema) if flushing data device fails

Long Li (2):
  xfs: factor out xfs_defer_pending_abort
  xfs: abort intent items when recovery intents fail

Omar Sandoval (1):
  xfs: fix internal error from AGFL exhaustion
 fs/xfs/libxfs/xfs_alloc.c       | 27 ++++++++++++--
 fs/xfs/libxfs/xfs_bmap.c        | 21 +++--------
 fs/xfs/libxfs/xfs_defer.c       | 28 +++++++++------
 fs/xfs/libxfs/xfs_defer.h       |  2 +-
 fs/xfs/libxfs/xfs_inode_buf.c   |  3 ++
 fs/xfs/libxfs/xfs_rtbitmap.c    | 33 +++++++++++++++++
 fs/xfs/libxfs/xfs_sb.h          |  2 +-
 fs/xfs/xfs_bmap_util.c          | 24 +++++++------
 fs/xfs/xfs_dquot.c              |  5 +--
 fs/xfs/xfs_dquot_item_recover.c | 21 +++++++++--
 fs/xfs/xfs_file.c               | 63 ++++++++++++++++++++++++++-------
 fs/xfs/xfs_inode.c              | 24 +++++++++++++
 fs/xfs/xfs_inode.h              | 17 +++++++++
 fs/xfs/xfs_inode_item_recover.c | 14 +++++++-
 fs/xfs/xfs_ioctl.c              | 30 ++++++++++------
 fs/xfs/xfs_iops.c               |  7 ++++
 fs/xfs/xfs_log.c                | 23 ++++++------
 fs/xfs/xfs_log_recover.c        |  2 +-
 fs/xfs/xfs_reflink.c            |  5 +++
 fs/xfs/xfs_rtalloc.c            | 33 +++++++++++++----
 fs/xfs/xfs_rtalloc.h            | 27 ++++++++------
 21 files changed, 310 insertions(+), 101 deletions(-)
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit 9488062805943c2d63350d3ef9e4dc093799789a ]
The latest version of the fs geometry structure is v5. Bump this constant so that xfs_db and mkfs calls to libxfs_fs_geometry will fill out all the fields.
IOWs, this commit is a no-op for the kernel, but will be useful for userspace reporting in later changes.
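As an illustration only (not part of the patch), here is a minimal userspace sketch of how a maximum struct version can gate which fields a fill routine populates. The struct, its fields, and the per-version cutoffs are hypothetical stand-ins rather than the real xfs_fsop_geom layout; only the idea of a caller being capped by a MAX_STRUCT_VER constant comes from the commit message.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a versioned geometry structure. */
struct demo_geom {
	uint32_t blocksize;	/* filled for every version */
	uint32_t rtextsize;	/* filled for version >= 2 (assumed cutoff) */
	uint32_t logsunit;	/* filled for version >= 4 (assumed cutoff) */
	uint32_t sick;		/* filled for version >= 5 (assumed cutoff) */
};

/* Fill only the fields that the requested struct version knows about. */
static void demo_fill_geometry(struct demo_geom *geo, int struct_version)
{
	memset(geo, 0, sizeof(*geo));
	geo->blocksize = 4096;
	if (struct_version < 2)
		return;
	geo->rtextsize = 16;
	if (struct_version < 4)
		return;
	geo->logsunit = 1;
	if (struct_version < 5)
		return;
	geo->sick = 1;
}
```

In this sketch a caller that passes 4 as the maximum never sees the newest field populated, which is the situation the constant bump is meant to avoid for userspace tools.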
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_sb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_sb.h b/fs/xfs/libxfs/xfs_sb.h
index a5e14740ec9a..19134b23c10b 100644
--- a/fs/xfs/libxfs/xfs_sb.h
+++ b/fs/xfs/libxfs/xfs_sb.h
@@ -23,11 +23,11 @@ extern void xfs_sb_quota_from_disk(struct xfs_sb *sbp);
 extern bool xfs_sb_good_version(struct xfs_sb *sbp);
 extern uint64_t xfs_sb_version_to_features(struct xfs_sb *sbp);
 
 extern int xfs_update_secondary_sbs(struct xfs_mount *mp);
 
-#define XFS_FS_GEOM_MAX_STRUCT_VER (4)
+#define XFS_FS_GEOM_MAX_STRUCT_VER (5)
 extern void xfs_fs_geometry(struct xfs_mount *mp, struct xfs_fsop_geom *geo,
 				int struct_version);
 extern int xfs_sb_read_secondary(struct xfs_mount *mp,
 				struct xfs_trans *tp, xfs_agnumber_t agno,
 				struct xfs_buf **bpp);
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 9488062805943c2d63350d3ef9e4dc093799789a
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 195f22386e19)
6.1.y | Present (different SHA1: 593486dfe122)

Note: The patch differs from the upstream commit:
---
1:  9488062805943 ! 1:  ba4444929ed01 xfs: bump max fsgeom struct version
    @@ Metadata
      ## Commit message ##
         xfs: bump max fsgeom struct version

    +    [ Upstream commit 9488062805943c2d63350d3ef9e4dc093799789a ]
    +
         The latest version of the fs geometry structure is v5. Bump this constant so that xfs_db and mkfs calls to libxfs_fs_geometry will fill out all the fields.
    @@ Commit message

         Signed-off-by: Darrick J. Wong <djwong@kernel.org>
         Reviewed-by: Christoph Hellwig <hch@lst.de>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/libxfs/xfs_sb.h ##
     @@ fs/xfs/libxfs/xfs_sb.h: extern uint64_t xfs_sb_version_to_features(struct xfs_sb *sbp);
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit 6c664484337b37fa0cf6e958f4019623e30d40f7 ]
Currently, xfs_bmap_del_extent_real contains a bunch of code to convert the physical extent of a data fork mapping for a realtime file into rt extents and pass that to the rt extent freeing function. Since the details of this aren't needed when CONFIG_XFS_REALTIME=n, move it to xfs_rtbitmap.c to reduce code size when realtime isn't enabled.
This will (one day) enable realtime EFIs to reuse the same unit-converting call with less code duplication.
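To make the unit conversion concrete, the following is a minimal userspace sketch of converting a range given in rt blocks into rt extents, with an alignment check on both values. The demo_ name and the plain integer types are assumptions for illustration; it mirrors the shape of the hoisted helper but is not the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Convert a range given in rt blocks into rt extents.
 * rextsize is the number of blocks per rt extent (assumed nonzero).
 * Returns 0 on success, -EIO if either value is not extent-aligned.
 */
static int demo_rt_blocks_to_extents(uint64_t rtbno, uint64_t rtlen,
				     uint32_t rextsize,
				     uint64_t *ext_start, uint64_t *ext_count)
{
	if (rtlen % rextsize)
		return -EIO;
	if (rtbno % rextsize)
		return -EIO;

	*ext_start = rtbno / rextsize;
	*ext_count = rtlen / rextsize;
	return 0;
}
```

With 16 blocks per rt extent, a range starting at block 64 and spanning 32 blocks maps to extent 4 with a length of 2 extents, while a start of 65 is rejected as misaligned.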
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_bmap.c     | 19 +++----------------
 fs/xfs/libxfs/xfs_rtbitmap.c | 33 +++++++++++++++++++++++++++++++++
 fs/xfs/xfs_rtalloc.h         |  5 +++++
 3 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 9dc33cdc2ab9..d45a2e681f93 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5035,37 +5035,24 @@ xfs_bmap_del_extent_real(
 	    del->br_startoff > got.br_startoff && del_endoff < got_endoff)
 		return -ENOSPC;
 
 	flags = XFS_ILOG_CORE;
 	if (whichfork == XFS_DATA_FORK && XFS_IS_REALTIME_INODE(ip)) {
-		xfs_filblks_t len;
-		xfs_extlen_t mod;
-
-		len = div_u64_rem(del->br_blockcount, mp->m_sb.sb_rextsize,
-				  &mod);
-		ASSERT(mod == 0);
-
 		if (!(bflags & XFS_BMAPI_REMAP)) {
-			xfs_fsblock_t bno;
-
-			bno = div_u64_rem(del->br_startblock,
-					mp->m_sb.sb_rextsize, &mod);
-			ASSERT(mod == 0);
-
-			error = xfs_rtfree_extent(tp, bno, (xfs_extlen_t)len);
+			error = xfs_rtfree_blocks(tp, del->br_startblock,
+					del->br_blockcount);
 			if (error)
 				goto done;
 		}
 
 		do_fx = 0;
-		nblks = len * mp->m_sb.sb_rextsize;
 		qfield = XFS_TRANS_DQ_RTBCOUNT;
 	} else {
 		do_fx = 1;
-		nblks = del->br_blockcount;
 		qfield = XFS_TRANS_DQ_BCOUNT;
 	}
+	nblks = del->br_blockcount;
 
 	del_endblock = del->br_startblock + del->br_blockcount;
 	if (cur) {
 		error = xfs_bmbt_lookup_eq(cur, &got, &i);
 		if (error)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index fa180ab66b73..655108a4cd05 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1003,10 +1003,43 @@ xfs_rtfree_extent(
 		xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE);
 	}
 	return 0;
 }
 
+/*
+ * Free some blocks in the realtime subvolume. rtbno and rtlen are in units of
+ * rt blocks, not rt extents; must be aligned to the rt extent size; and rtlen
+ * cannot exceed XFS_MAX_BMBT_EXTLEN.
+ */
+int
+xfs_rtfree_blocks(
+	struct xfs_trans *tp,
+	xfs_fsblock_t rtbno,
+	xfs_filblks_t rtlen)
+{
+	struct xfs_mount *mp = tp->t_mountp;
+	xfs_rtblock_t bno;
+	xfs_filblks_t len;
+	xfs_extlen_t mod;
+
+	ASSERT(rtlen <= XFS_MAX_BMBT_EXTLEN);
+
+	len = div_u64_rem(rtlen, mp->m_sb.sb_rextsize, &mod);
+	if (mod) {
+		ASSERT(mod == 0);
+		return -EIO;
+	}
+
+	bno = div_u64_rem(rtbno, mp->m_sb.sb_rextsize, &mod);
+	if (mod) {
+		ASSERT(mod == 0);
+		return -EIO;
+	}
+
+	return xfs_rtfree_extent(tp, bno, len);
+}
+
 /* Find all the free records within a given range. */
 int
 xfs_rtalloc_query_range(
 	struct xfs_mount *mp,
 	struct xfs_trans *tp,
diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
index 62c7ad79cbb6..3b2f1b499a11 100644
--- a/fs/xfs/xfs_rtalloc.h
+++ b/fs/xfs/xfs_rtalloc.h
@@ -56,10 +56,14 @@ int /* error */
 xfs_rtfree_extent(
 	struct xfs_trans *tp, /* transaction pointer */
 	xfs_rtblock_t bno, /* starting block number to free */
 	xfs_extlen_t len); /* length of extent freed */
 
+/* Same as above, but in units of rt blocks. */
+int xfs_rtfree_blocks(struct xfs_trans *tp, xfs_fsblock_t rtbno,
+		xfs_filblks_t rtlen);
+
 /*
  * Initialize realtime fields in the mount structure.
  */
 int /* error */
 xfs_rtmount_init(
@@ -137,10 +141,11 @@ int xfs_rtalloc_extent_is_free(struct xfs_mount *mp, struct xfs_trans *tp,
 			       bool *is_free);
 int xfs_rtalloc_reinit_frextents(struct xfs_mount *mp);
 #else
 # define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb) (ENOSYS)
 # define xfs_rtfree_extent(t,b,l) (ENOSYS)
+# define xfs_rtfree_blocks(t,rb,rl) (ENOSYS)
 # define xfs_rtpick_extent(m,t,l,rb) (ENOSYS)
 # define xfs_growfs_rt(mp,in) (ENOSYS)
 # define xfs_rtalloc_query_range(t,l,h,f,p) (ENOSYS)
 # define xfs_rtalloc_query_all(m,t,f,p) (ENOSYS)
 # define xfs_rtbuf_get(m,t,b,i,p) (ENOSYS)
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 6c664484337b37fa0cf6e958f4019623e30d40f7
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: e820b13ba866)
6.1.y | Present (different SHA1: 24a3929ec784)

Note: The patch differs from the upstream commit:
---
1:  6c664484337b3 ! 1:  1b742f230fc2e xfs: hoist freeing of rt data fork extent mappings
    @@ Metadata
      ## Commit message ##
         xfs: hoist freeing of rt data fork extent mappings

    +    [ Upstream commit 6c664484337b37fa0cf6e958f4019623e30d40f7 ]
    +
         Currently, xfs_bmap_del_extent_real contains a bunch of code to convert the physical extent of a data fork mapping for a realtime file into rt extents and pass that to the rt extent freeing function. Since the
    @@ Commit message

         Signed-off-by: Darrick J. Wong <djwong@kernel.org>
         Reviewed-by: Christoph Hellwig <hch@lst.de>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/libxfs/xfs_bmap.c ##
     @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmap_del_extent_real(
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit b73494fa9a304ab95b59f07845e8d7d36e4d23e0 ]
Quotas aren't (yet) supported with realtime, so we shouldn't allow userspace to set up a realtime section when quotas are enabled, even if they attached one via mount options. IOWS, you shouldn't be able to do:
# mkfs.xfs -f /dev/sda
# mount /dev/sda /mnt -o rtdev=/dev/sdb,usrquota
# xfs_growfs -r /mnt
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_rtalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 292d5e54a92c..34980d7c2dd6 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -952,11 +952,11 @@ xfs_growfs_rt(
 	if (XFS_FSB_TO_B(mp, in->extsize) > XFS_MAX_RTEXTSIZE ||
 	    XFS_FSB_TO_B(mp, in->extsize) < XFS_MIN_RTEXTSIZE)
 		return -EINVAL;
 
 	/* Unsupported realtime features. */
-	if (xfs_has_rmapbt(mp) || xfs_has_reflink(mp))
+	if (xfs_has_rmapbt(mp) || xfs_has_reflink(mp) || xfs_has_quota(mp))
 		return -EOPNOTSUPP;
 
 	nrblocks = in->newblocks;
 	error = xfs_sb_validate_fsb_count(sbp, nrblocks);
 	if (error)
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: b73494fa9a304ab95b59f07845e8d7d36e4d23e0
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 6a6bb41b31df)
6.1.y | Present (different SHA1: a68e3ff6bba2)

Note: The patch differs from the upstream commit:
---
1:  b73494fa9a304 ! 1:  e2ad9605027dd xfs: prevent rt growfs when quota is enabled
    @@ Metadata
      ## Commit message ##
         xfs: prevent rt growfs when quota is enabled

    +    [ Upstream commit b73494fa9a304ab95b59f07845e8d7d36e4d23e0 ]
    +
         Quotas aren't (yet) supported with realtime, so we shouldn't allow userspace to set up a realtime section when quotas are enabled, even if they attached one via mount options. IOWS, you shouldn't be able to do:
    @@ Commit message

         Signed-off-by: Darrick J. Wong <djwong@kernel.org>
         Reviewed-by: Christoph Hellwig <hch@lst.de>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/xfs_rtalloc.c ##
     @@ fs/xfs/xfs_rtalloc.c: xfs_growfs_rt(
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit c2988eb5cff75c02bc57e02c323154aa08f55b78 ]
When realtime support is not compiled into the kernel, these functions should return negative errnos, not positive errnos. While we're at it, fix a broken macro declaration.
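As a small illustration of why the sign matters (not part of the patch), consider a caller that follows the usual kernel convention that a negative return value means failure. The demo_ macros and helper below are hypothetical models of the old and new stub bodies, written for userspace.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stubs modelling the old and new macro bodies. */
#define demo_stub_old()	(ENOSYS)
#define demo_stub_new()	(-ENOSYS)

/* A caller that follows the "negative value means error" convention. */
static bool demo_call_failed(int ret)
{
	return ret < 0;
}
```

In this sketch the positive value from the old stub is not recognized as a failure by such a caller, while the negative value from the new stub is.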
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_rtalloc.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
index 3b2f1b499a11..65c284e9d33e 100644
--- a/fs/xfs/xfs_rtalloc.h
+++ b/fs/xfs/xfs_rtalloc.h
@@ -139,31 +139,31 @@ bool xfs_verify_rtbno(struct xfs_mount *mp, xfs_rtblock_t rtbno);
 int xfs_rtalloc_extent_is_free(struct xfs_mount *mp, struct xfs_trans *tp,
 			       xfs_rtblock_t start, xfs_extlen_t len,
 			       bool *is_free);
 int xfs_rtalloc_reinit_frextents(struct xfs_mount *mp);
 #else
-# define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb) (ENOSYS)
-# define xfs_rtfree_extent(t,b,l) (ENOSYS)
-# define xfs_rtfree_blocks(t,rb,rl) (ENOSYS)
-# define xfs_rtpick_extent(m,t,l,rb) (ENOSYS)
-# define xfs_growfs_rt(mp,in) (ENOSYS)
-# define xfs_rtalloc_query_range(t,l,h,f,p) (ENOSYS)
-# define xfs_rtalloc_query_all(m,t,f,p) (ENOSYS)
-# define xfs_rtbuf_get(m,t,b,i,p) (ENOSYS)
-# define xfs_verify_rtbno(m, r) (false)
-# define xfs_rtalloc_extent_is_free(m,t,s,l,i) (ENOSYS)
-# define xfs_rtalloc_reinit_frextents(m) (0)
+# define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb) (-ENOSYS)
+# define xfs_rtfree_extent(t,b,l) (-ENOSYS)
+# define xfs_rtfree_blocks(t,rb,rl) (-ENOSYS)
+# define xfs_rtpick_extent(m,t,l,rb) (-ENOSYS)
+# define xfs_growfs_rt(mp,in) (-ENOSYS)
+# define xfs_rtalloc_query_range(m,t,l,h,f,p) (-ENOSYS)
+# define xfs_rtalloc_query_all(m,t,f,p) (-ENOSYS)
+# define xfs_rtbuf_get(m,t,b,i,p) (-ENOSYS)
+# define xfs_verify_rtbno(m, r) (false)
+# define xfs_rtalloc_extent_is_free(m,t,s,l,i) (-ENOSYS)
+# define xfs_rtalloc_reinit_frextents(m) (0)
 static inline int /* error */
 xfs_rtmount_init(
 	xfs_mount_t *mp) /* file system mount structure */
 {
 	if (mp->m_sb.sb_rblocks == 0)
 		return 0;
 
 	xfs_warn(mp, "Not built with CONFIG_XFS_RT");
 	return -ENOSYS;
 }
-# define xfs_rtmount_inodes(m) (((mp)->m_sb.sb_rblocks == 0)? 0 : (ENOSYS))
+# define xfs_rtmount_inodes(m) (((mp)->m_sb.sb_rblocks == 0)? 0 : (-ENOSYS))
 # define xfs_rtunmount_inodes(m)
 #endif /* CONFIG_XFS_RT */
 
 #endif /* __XFS_RTALLOC_H__ */
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: c2988eb5cff75c02bc57e02c323154aa08f55b78
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: fe327b8234d4)
6.1.y | Present (different SHA1: f81de59216c1)

Note: The patch differs from the upstream commit:
---
Failed to apply patch cleanly, falling back to interdiff...
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit ddd98076d5c075c8a6c49d9e6e8ee12844137f23 ]
The unit conversions in this function do not make sense. First we convert a block count to bytes, then divide that bytes value by rextsize, which is in blocks, to get an rt extent count. You can't divide bytes by blocks to get a (possibly multiblock) extent value.
Fortunately nobody uses delalloc on the rt volume so this hasn't mattered.
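A quick worked example of the mismatch, as a userspace sketch with hypothetical demo_ helpers and example sizes chosen purely for illustration: the old computation divides a byte count by a blocks-per-extent value, while the fixed one divides blocks by blocks-per-extent.

```c
#include <stdint.h>

/* Old logic: bytes divided by blocks-per-extent, which mixes units. */
static uint64_t demo_rtexts_old(uint64_t blocks, uint32_t blocksize,
				uint32_t rextsize)
{
	uint64_t bytes = blocks * blocksize;

	return bytes / rextsize;
}

/* Fixed logic: blocks divided by blocks-per-extent gives rt extents. */
static uint64_t demo_rtexts_new(uint64_t blocks, uint32_t rextsize)
{
	return blocks / rextsize;
}
```

With 32 blocks of 4096 bytes and 16 blocks per rt extent, the old formula yields 8192 where the correct extent count is 2.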
Fixes: fa5c836ca8eb5 ("xfs: refactor xfs_bunmapi_cow")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index d45a2e681f93..27d3121e6da9 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4805,11 +4805,11 @@ xfs_bmap_del_extent_delay(
 	ASSERT(del->br_blockcount > 0);
 	ASSERT(got->br_startoff <= del->br_startoff);
 	ASSERT(got_endoff >= del_endoff);
 
 	if (isrt) {
-		uint64_t rtexts = XFS_FSB_TO_B(mp, del->br_blockcount);
+		uint64_t rtexts = del->br_blockcount;
 
 		do_div(rtexts, mp->m_sb.sb_rextsize);
 		xfs_mod_frextents(mp, rtexts);
 	}
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: ddd98076d5c075c8a6c49d9e6e8ee12844137f23
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: e3aca4536b6b)
6.1.y | Present (different SHA1: 0af8f29df730)

Note: The patch differs from the upstream commit:
---
1:  ddd98076d5c07 ! 1:  0835e674aff6a xfs: fix units conversion error in xfs_bmap_del_extent_delay
    @@ Metadata
      ## Commit message ##
         xfs: fix units conversion error in xfs_bmap_del_extent_delay

    +    [ Upstream commit ddd98076d5c075c8a6c49d9e6e8ee12844137f23 ]
    +
         The unit conversions in this function do not make sense. First we convert a block count to bytes, then divide that bytes value by rextsize, which is in blocks, to get an rt extent count. You can't
    @@ Commit message
         Fixes: fa5c836ca8eb5 ("xfs: refactor xfs_bunmapi_cow")
         Signed-off-by: Darrick J. Wong <djwong@kernel.org>
         Reviewed-by: Christoph Hellwig <hch@lst.de>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/libxfs/xfs_bmap.c ##
     @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmap_del_extent_delay(
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit f6a2dae2a1f52ea23f649c02615d073beba4cc35 ]
In commit 2a6ca4baed62, we tried to fix an overflow problem in the realtime allocator that was caused by an overly large maxlen value causing xfs_rtcheck_range to run off the end of the realtime bitmap. Unfortunately, there is a subtle bug here -- maxlen (and minlen) both have to be aligned with @prod, but @prod can be larger than 1 if the user has set an extent size hint on the file, and that extent size hint is larger than the realtime extent size.
If the rt free space extents are not aligned to this file's extszhint because other files without extent size hints allocated space (or the number of rt extents is similarly not aligned), then it's possible that maxlen after clamping to sb_rextents will no longer be aligned to prod. The allocation will succeed just fine, but we still trip the assertion.
Fix the problem by reducing maxlen by any misalignment with prod. While we're at it, split the assertions into two so that we can tell which value had the bad alignment.
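The misalignment is easy to reproduce with small numbers. The sketch below is a userspace model with hypothetical demo_ helpers and plain 64-bit integers; it assumes start is below the volume size and prod is nonzero, and it only illustrates the clamp-then-round-down idea described above.

```c
#include <stdint.h>

/* Old clamp: keep the range inside the volume, ignoring alignment. */
static uint64_t demo_clamp_old(uint64_t rextents, uint64_t start,
			       uint64_t len)
{
	uint64_t end = start + len;

	if (end > rextents)
		end = rextents;
	return end - start;
}

/* New clamp: same limit, then round down to a multiple of prod. */
static uint64_t demo_clamp_new(uint64_t rextents, uint64_t start,
			       uint64_t len, uint64_t prod)
{
	uint64_t ret = demo_clamp_old(rextents, start, len);

	return ret - (ret % prod);
}
```

For a volume of 103 rt extents, a request for 8 extents at offset 96 with prod = 4 is clamped to 7 by the old logic, which is no longer a multiple of 4; the new logic returns 4. A request that fits entirely inside the volume is unchanged.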
Fixes: 2a6ca4baed62 ("xfs: make sure the rt allocator doesn't run off the end")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_rtalloc.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 34980d7c2dd6..0bfbbc1dd0da 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -209,10 +209,27 @@ xfs_rtallocate_range(
 	 */
 	error = xfs_rtmodify_range(mp, tp, start, len, 0);
 	return error;
 }
 
+/*
+ * Make sure we don't run off the end of the rt volume. Be careful that
+ * adjusting maxlen downwards doesn't cause us to fail the alignment checks.
+ */
+static inline xfs_extlen_t
+xfs_rtallocate_clamp_len(
+	struct xfs_mount *mp,
+	xfs_rtblock_t startrtx,
+	xfs_extlen_t rtxlen,
+	xfs_extlen_t prod)
+{
+	xfs_extlen_t ret;
+
+	ret = min(mp->m_sb.sb_rextents, startrtx + rtxlen) - startrtx;
+	return rounddown(ret, prod);
+}
+
 /*
  * Attempt to allocate an extent minlen<=len<=maxlen starting from
  * bitmap block bbno. If we don't get maxlen then use prod to trim
  * the length, if given. Returns error; returns starting block in *rtblock.
  * The lengths are all in rtextents.
@@ -246,11 +263,11 @@ xfs_rtallocate_extent_block(
 	for (i = XFS_BLOCKTOBIT(mp, bbno), besti = -1, bestlen = 0,
 		end = XFS_BLOCKTOBIT(mp, bbno + 1) - 1;
 	     i <= end;
 	     i++) {
 		/* Make sure we don't scan off the end of the rt volume. */
-		maxlen = min(mp->m_sb.sb_rextents, i + maxlen) - i;
+		maxlen = xfs_rtallocate_clamp_len(mp, i, maxlen, prod);
 
 		/*
 		 * See if there's a free extent of maxlen starting at i.
 		 * If it's not so then next will contain the first non-free.
 		 */
@@ -353,11 +370,12 @@ xfs_rtallocate_extent_exact(
 	int error; /* error value */
 	xfs_extlen_t i; /* extent length trimmed due to prod */
 	int isfree; /* extent is free */
 	xfs_rtblock_t next; /* next block to try (dummy) */
 
-	ASSERT(minlen % prod == 0 && maxlen % prod == 0);
+	ASSERT(minlen % prod == 0);
+	ASSERT(maxlen % prod == 0);
 	/*
 	 * Check if the range in question (for maxlen) is free.
 	 */
 	error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, &next, &isfree);
 	if (error) {
@@ -436,20 +454,22 @@ xfs_rtallocate_extent_near(
 	int j; /* secondary loop control */
 	int log2len; /* log2 of minlen */
 	xfs_rtblock_t n; /* next block to try */
 	xfs_rtblock_t r; /* result block */
 
-	ASSERT(minlen % prod == 0 && maxlen % prod == 0);
+	ASSERT(minlen % prod == 0);
+	ASSERT(maxlen % prod == 0);
+
 	/*
 	 * If the block number given is off the end, silently set it to
 	 * the last block.
 	 */
 	if (bno >= mp->m_sb.sb_rextents)
 		bno = mp->m_sb.sb_rextents - 1;
 
 	/* Make sure we don't run off the end of the rt volume. */
-	maxlen = min(mp->m_sb.sb_rextents, bno + maxlen) - bno;
+	maxlen = xfs_rtallocate_clamp_len(mp, bno, maxlen, prod);
 	if (maxlen < minlen) {
 		*rtblock = NULLRTBLOCK;
 		return 0;
 	}
 
@@ -636,11 +656,12 @@ xfs_rtallocate_extent_size(
 	int l; /* level number (loop control) */
 	xfs_rtblock_t n; /* next block to be tried */
 	xfs_rtblock_t r; /* result block number */
 	xfs_suminfo_t sum; /* summary information for extents */
 
-	ASSERT(minlen % prod == 0 && maxlen % prod == 0);
+	ASSERT(minlen % prod == 0);
+	ASSERT(maxlen % prod == 0);
 	ASSERT(maxlen != 0);
 
 	/*
 	 * Loop over all the levels starting with maxlen.
 	 * At each level, look at all the bitmap blocks, to see if there
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: f6a2dae2a1f52ea23f649c02615d073beba4cc35
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 0fbbfe5fbfbe)
6.1.y | Present (different SHA1: 5d1a85efae8f)

Note: The patch differs from the upstream commit:
---
1:  f6a2dae2a1f52 ! 1:  ce863045e9922 xfs: make sure maxlen is still congruent with prod when rounding down
    @@ Metadata
      ## Commit message ##
         xfs: make sure maxlen is still congruent with prod when rounding down

    +    [ Upstream commit f6a2dae2a1f52ea23f649c02615d073beba4cc35 ]
    +
         In commit 2a6ca4baed62, we tried to fix an overflow problem in the realtime allocator that was caused by an overly large maxlen value causing xfs_rtcheck_range to run off the end of the realtime bitmap.
    @@ Commit message
         Fixes: 2a6ca4baed62 ("xfs: make sure the rt allocator doesn't run off the end")
         Signed-off-by: Darrick J. Wong <djwong@kernel.org>
         Reviewed-by: Christoph Hellwig <hch@lst.de>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/xfs_rtalloc.c ##
     @@ fs/xfs/xfs_rtalloc.c: xfs_rtallocate_range(
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Cheng Lin <cheng.lin130@zte.com.cn>
[ Upstream commit 2b99e410b28f5a75ae417e6389e767c7745d6fce ]
When abnormal drop_nlink are detected on the inode, return error, to avoid corruption propagation.
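To show what the guard protects against, here is a userspace sketch with a hypothetical inode structure and a placeholder error code (DEMO_EFSCORRUPTED is an assumption for this example, not the kernel's definition). Decrementing an unsigned link count that is already zero would wrap around to a very large value, so the sketch refuses the operation instead.

```c
#include <stdint.h>

/* Placeholder error code for this sketch only. */
#define DEMO_EFSCORRUPTED	117

/* Hypothetical, minimal inode with an unsigned link count. */
struct demo_inode {
	uint32_t nlink;
};

/* Drop one link, refusing to underflow a count that is already zero. */
static int demo_droplink(struct demo_inode *ip)
{
	if (ip->nlink == 0)
		return -DEMO_EFSCORRUPTED;

	ip->nlink--;
	return 0;
}
```

Dropping the last link succeeds and leaves the count at zero; a second drop returns the error and leaves the count untouched rather than wrapping.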
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_inode.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 909085269227..1d32823d5099 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -917,10 +917,17 @@ xfs_init_new_inode(
 static int /* error */
 xfs_droplink(
 	xfs_trans_t *tp,
 	xfs_inode_t *ip)
 {
+	if (VFS_I(ip)->i_nlink == 0) {
+		xfs_alert(ip->i_mount,
+			  "%s: Attempt to drop inode (%llu) with nlink zero.",
+			  __func__, ip->i_ino);
+		return -EFSCORRUPTED;
+	}
+
 	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
 
 	drop_nlink(VFS_I(ip));
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 2b99e410b28f5a75ae417e6389e767c7745d6fce
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Cheng Lin <cheng.lin130@zte.com.cn>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 47b07e51d0c2)
6.1.y | Present (different SHA1: 85d34cba11ff)

Note: The patch differs from the upstream commit:
---
1:  2b99e410b28f5 ! 1:  9782c142297e3 xfs: introduce protection for drop nlink
    @@ Metadata
      ## Commit message ##
         xfs: introduce protection for drop nlink

    +    [ Upstream commit 2b99e410b28f5a75ae417e6389e767c7745d6fce ]
    +
         When abnormal drop_nlink are detected on the inode, return error, to avoid corruption propagation.

         Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
         Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
         Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
    +    Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>

      ## fs/xfs/xfs_inode.c ##
     @@ fs/xfs/xfs_inode.c: xfs_droplink(
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Christoph Hellwig <hch@lst.de>
[ Upstream commit 35dc55b9e80cb9ec4bcb969302000b002b2ed850 ]
If xfs_bmapi_write finds a delalloc extent at the requested range, it tries to convert the entire delalloc extent to a real allocation.
But if the allocator cannot find a single free extent large enough to cover the start block of the requested range, xfs_bmapi_write will return 0 but leave *nimaps set to 0.
In that case we simply need to keep looping with the same startoffset_fsb so that one of the following allocations will eventually reach the requested range.
Note that this could affect any caller of xfs_bmapi_write that covers an existing delayed allocation. As far as I can tell we do not have any other such caller, though - the regular writeback path uses xfs_bmapi_convert_delalloc to convert delayed allocations to real ones, and direct I/O invalidates the page cache first.
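The retry behaviour can be modelled in a few lines of userspace C. Everything here is hypothetical: demo_bmapi_write is a fake allocator that reports zero mappings on its first call and at most four blocks per call afterwards, standing in for an allocator that made progress elsewhere in the delalloc extent. The sketch only shows the loop shape, where a zero-mapping result leaves the offset untouched and the loop simply tries again.

```c
#include <stdint.h>

/* Hypothetical mapping record: only the length matters here. */
struct demo_map {
	uint64_t blockcount;
};

static int demo_calls;	/* number of fake allocator invocations */

/* Fake allocator: first call maps nothing, later calls map <= 4 blocks. */
static int demo_bmapi_write(uint64_t start, uint64_t len,
			    struct demo_map *map, int *nimaps)
{
	(void)start;
	demo_calls++;
	if (demo_calls == 1) {
		*nimaps = 0;	/* success, but nothing at 'start' yet */
		return 0;
	}
	map->blockcount = len < 4 ? len : 4;
	*nimaps = 1;
	return 0;
}

/* Allocate [start, start + len), retrying when no mapping is returned. */
static int demo_alloc_file_space(uint64_t start, uint64_t len,
				 uint64_t *end_out)
{
	while (len) {
		struct demo_map map = { 0 };
		int nimaps = 1;
		int error = demo_bmapi_write(start, len, &map, &nimaps);

		if (error)
			return error;
		/* Only advance when a mapping was actually returned. */
		if (nimaps) {
			start += map.blockcount;
			len -= map.blockcount;
		}
	}
	*end_out = start;
	return 0;
}
```

The sketch terminates only because the fake allocator eventually makes progress; it treats the empty first result as "try again" rather than as an out-of-space error.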
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_bmap_util.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index ce8e17ab5434..468bb61a5e46 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -778,49 +778,47 @@ xfs_alloc_file_space(
 	xfs_off_t offset,
 	xfs_off_t len)
 {
 	xfs_mount_t *mp = ip->i_mount;
 	xfs_off_t count;
-	xfs_filblks_t allocated_fsb;
 	xfs_filblks_t allocatesize_fsb;
 	xfs_extlen_t extsz, temp;
 	xfs_fileoff_t startoffset_fsb;
 	xfs_fileoff_t endoffset_fsb;
-	int nimaps;
 	int rt;
 	xfs_trans_t *tp;
 	xfs_bmbt_irec_t imaps[1], *imapp;
 	int error;
 
 	trace_xfs_alloc_file_space(ip);
 
 	if (xfs_is_shutdown(mp))
 		return -EIO;
 
 	error = xfs_qm_dqattach(ip);
 	if (error)
 		return error;
 
 	if (len <= 0)
 		return -EINVAL;
 
 	rt = XFS_IS_REALTIME_INODE(ip);
 	extsz = xfs_get_extsz_hint(ip);
 
 	count = len;
 	imapp = &imaps[0];
-	nimaps = 1;
 	startoffset_fsb = XFS_B_TO_FSBT(mp, offset);
 	endoffset_fsb = XFS_B_TO_FSB(mp, offset + count);
 	allocatesize_fsb = endoffset_fsb - startoffset_fsb;
 
 	/*
 	 * Allocate file space until done or until there is an error
 	 */
 	while (allocatesize_fsb && !error) {
 		xfs_fileoff_t s, e;
 		unsigned int dblocks, rblocks, resblks;
+		int nimaps = 1;
 
 		/*
 		 * Determine space reservations for data/realtime.
 		 */
 		if (unlikely(extsz)) {
@@ -882,19 +880,23 @@ xfs_alloc_file_space(
 		error = xfs_trans_commit(tp);
 		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 		if (error)
 			break;
 
-		allocated_fsb = imapp->br_blockcount;
-
-		if (nimaps == 0) {
-			error = -ENOSPC;
-			break;
+		/*
+		 * If the allocator cannot find a single free extent large
+		 * enough to cover the start block of the requested range,
+		 * xfs_bmapi_write will return 0 but leave *nimaps set to 0.
+		 *
+		 * In that case we simply need to keep looping with the same
+		 * startoffset_fsb so that one of the following allocations
+		 * will eventually reach the requested range.
+		 */
+		if (nimaps) {
+			startoffset_fsb += imapp->br_blockcount;
+			allocatesize_fsb -= imapp->br_blockcount;
 		}
-
-		startoffset_fsb += allocated_fsb;
-		allocatesize_fsb -= allocated_fsb;
 	}
 
 	return error;
 
 error:
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 35dc55b9e80cb9ec4bcb969302000b002b2ed850
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Christoph Hellwighch@lst.de
Status in newer kernel trees: 6.13.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: d4eba134c509) 6.1.y | Present (different SHA1: d2c306421d9c)
Note: The patch differs from the upstream commit: --- 1: 35dc55b9e80cb ! 1: 1dac0e648f50e xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space @@ Metadata ## Commit message ## xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space
+ [ Upstream commit 35dc55b9e80cb9ec4bcb969302000b002b2ed850 ] + If xfs_bmapi_write finds a delalloc extent at the requested range, it tries to convert the entire delalloc extent to a real allocation.
@@ Commit message Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_bmap_util.c ## @@ fs/xfs/xfs_bmap_util.c: xfs_alloc_file_space( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Catherine Hoang catherine.hoang@oracle.com
[ Upstream commit 14a537983b228cb050ceca3a5b743d01315dc4aa ]
One of our VM cluster management products needs to snapshot KVM image files so that they can be restored in case of failure. Snapshotting is done by redirecting VM disk writes to a sidecar file and using reflink on the disk image, specifically the FICLONE ioctl as used by "cp --reflink". Reflink locks the source and destination files while it operates, which means that reads from the main vm disk image are blocked, causing the vm to stall. When an image file is heavily fragmented, the copy process could take several minutes. Some of the vm image files have 50-100 million extent records, and duplicating that much metadata locks the file for 30 minutes or more. Having activities suspended for such a long time in a cluster node could result in node eviction.
Clone operations and read IO do not change any data in the source file, so they should be able to run concurrently. Demote the exclusive locks taken by FICLONE to shared locks to allow reads while cloning. While a clone is in progress, writes will take the IOLOCK_EXCL, so they block until the clone completes.
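As an illustration only (not part of the patch), the write-side rule described above can be modeled in user-space C. The names toy_inode, toy_lock, toy_unlock and toy_lock_for_write are hypothetical. The toy lock merely counts holders and never blocks, whereas in the kernel the exclusive acquisition sleeps until the clone drops its shared lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for XFS_IOLOCK_SHARED / XFS_IOLOCK_EXCL. */
enum toy_mode { TOY_SHARED = 1, TOY_EXCL = 2 };

/* Hypothetical inode model: a remap flag plus holder counters. */
struct toy_inode {
	bool remapping;	/* stands in for XFS_IREMAPPING */
	int shared;	/* current shared holders */
	int excl;	/* current exclusive holders */
};

static void toy_lock(struct toy_inode *ip, enum toy_mode mode)
{
	if (mode == TOY_EXCL)
		ip->excl++;
	else
		ip->shared++;
}

static void toy_unlock(struct toy_inode *ip, enum toy_mode mode)
{
	if (mode == TOY_EXCL) {
		assert(ip->excl > 0);
		ip->excl--;
	} else {
		assert(ip->shared > 0);
		ip->shared--;
	}
}

/*
 * Take the lock for a write. A shared request is upgraded to exclusive
 * when a remap is in progress. Returns the mode actually held.
 */
static enum toy_mode toy_lock_for_write(struct toy_inode *ip,
					enum toy_mode want)
{
	toy_lock(ip, want);
	if (want == TOY_EXCL || !ip->remapping)
		return want;

	/* Remap in progress: drop the shared lock and retake exclusively. */
	toy_unlock(ip, want);
	toy_lock(ip, TOY_EXCL);
	return TOY_EXCL;
}
```

A caller would pass the mode it would normally use and unlock with whatever mode comes back, since that may differ from the one requested.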
Link: https://lore.kernel.org/linux-xfs/8911B94D-DD29-4D6E-B5BC-32EAF1866245@oracl...
Signed-off-by: Catherine Hoang catherine.hoang@oracle.com
Reviewed-by: "Darrick J. Wong" djwong@kernel.org
Reviewed-by: Dave Chinner dchinner@redhat.com
Reviewed-by: Christoph Hellwig hch@lst.de
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/xfs_file.c    | 63 +++++++++++++++++++++++++++++++++++---------
 fs/xfs/xfs_inode.c   | 17 ++++++++++++
 fs/xfs/xfs_inode.h   |  9 +++++++
 fs/xfs/xfs_reflink.c |  4 +++
 4 files changed, 80 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 8de40cf63a5b..821cb86a83bd 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -212,10 +212,47 @@ xfs_ilock_iocb( }
return 0; }
+static int +xfs_ilock_iocb_for_write( + struct kiocb *iocb, + unsigned int *lock_mode) +{ + ssize_t ret; + struct xfs_inode *ip = XFS_I(file_inode(iocb->ki_filp)); + + ret = xfs_ilock_iocb(iocb, *lock_mode); + if (ret) + return ret; + + if (*lock_mode == XFS_IOLOCK_EXCL) + return 0; + if (!xfs_iflags_test(ip, XFS_IREMAPPING)) + return 0; + + xfs_iunlock(ip, *lock_mode); + *lock_mode = XFS_IOLOCK_EXCL; + return xfs_ilock_iocb(iocb, *lock_mode); +} + +static unsigned int +xfs_ilock_for_write_fault( + struct xfs_inode *ip) +{ + /* get a shared lock if no remapping in progress */ + xfs_ilock(ip, XFS_MMAPLOCK_SHARED); + if (!xfs_iflags_test(ip, XFS_IREMAPPING)) + return XFS_MMAPLOCK_SHARED; + + /* wait for remapping to complete */ + xfs_iunlock(ip, XFS_MMAPLOCK_SHARED); + xfs_ilock(ip, XFS_MMAPLOCK_EXCL); + return XFS_MMAPLOCK_EXCL; +} + STATIC ssize_t xfs_file_dio_read( struct kiocb *iocb, struct iov_iter *to) { @@ -521,11 +558,11 @@ xfs_file_dio_write_aligned( struct iov_iter *from) { unsigned int iolock = XFS_IOLOCK_SHARED; ssize_t ret;
- ret = xfs_ilock_iocb(iocb, iolock); + ret = xfs_ilock_iocb_for_write(iocb, &iolock); if (ret) return ret; ret = xfs_file_write_checks(iocb, from, &iolock); if (ret) goto out_unlock; @@ -588,11 +625,11 @@ xfs_file_dio_write_unaligned( retry_exclusive: iolock = XFS_IOLOCK_EXCL; flags = IOMAP_DIO_FORCE_WAIT; }
- ret = xfs_ilock_iocb(iocb, iolock); + ret = xfs_ilock_iocb_for_write(iocb, &iolock); if (ret) return ret;
/* * We can't properly handle unaligned direct I/O to reflink files yet, @@ -1156,11 +1193,11 @@ xfs_file_remap_range( goto out_unlock;
if (xfs_file_sync_writes(file_in) || xfs_file_sync_writes(file_out)) xfs_log_force_inode(dest); out_unlock: - xfs_iunlock2_io_mmap(src, dest); + xfs_iunlock2_remapping(src, dest); if (ret) trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); /* * If the caller did not set CAN_SHORTEN, then it is not prepared to * handle partial results -- either the whole remap succeeds, or we @@ -1311,37 +1348,37 @@ __xfs_filemap_fault( bool write_fault) { struct inode *inode = file_inode(vmf->vma->vm_file); struct xfs_inode *ip = XFS_I(inode); vm_fault_t ret; + unsigned int lock_mode = 0;
trace_xfs_filemap_fault(ip, pe_size, write_fault);
if (write_fault) { sb_start_pagefault(inode->i_sb); file_update_time(vmf->vma->vm_file); }
+ if (IS_DAX(inode) || write_fault) + lock_mode = xfs_ilock_for_write_fault(XFS_I(inode)); + if (IS_DAX(inode)) { pfn_t pfn;
- xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); ret = xfs_dax_fault(vmf, pe_size, write_fault, &pfn); if (ret & VM_FAULT_NEEDDSYNC) ret = dax_finish_sync_fault(vmf, pe_size, pfn); - xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); + } else if (write_fault) { + ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops); } else { - if (write_fault) { - xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - ret = iomap_page_mkwrite(vmf, - &xfs_page_mkwrite_iomap_ops); - xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - } else { - ret = filemap_fault(vmf); - } + ret = filemap_fault(vmf); }
+ if (lock_mode) + xfs_iunlock(XFS_I(inode), lock_mode); + if (write_fault) sb_end_pagefault(inode->i_sb); return ret; }
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 1d32823d5099..dc84c75be852 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -3642,10 +3642,27 @@ xfs_iunlock2_io_mmap( inode_unlock(VFS_I(ip2)); if (ip1 != ip2) inode_unlock(VFS_I(ip1)); }
+/* Drop the MMAPLOCK and the IOLOCK after a remap completes. */ +void +xfs_iunlock2_remapping( + struct xfs_inode *ip1, + struct xfs_inode *ip2) +{ + xfs_iflags_clear(ip1, XFS_IREMAPPING); + + if (ip1 != ip2) + xfs_iunlock(ip1, XFS_MMAPLOCK_SHARED); + xfs_iunlock(ip2, XFS_MMAPLOCK_EXCL); + + if (ip1 != ip2) + inode_unlock_shared(VFS_I(ip1)); + inode_unlock(VFS_I(ip2)); +} + /* * Reload the incore inode list for this inode. Caller should ensure that * the link count cannot change, either by taking ILOCK_SHARED or otherwise * preventing other threads from executing. */ diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 85395ad2859c..3a81477c7797 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -345,10 +345,18 @@ static inline bool xfs_inode_has_large_extent_counts(struct xfs_inode *ip) #define XFS_INACTIVATING (1 << 13)
/* Quotacheck is running but inode has not been added to quota counts. */ #define XFS_IQUOTAUNCHECKED (1 << 14)
+/* + * Remap in progress. Callers that wish to update file data while + * holding a shared IOLOCK or MMAPLOCK must drop the lock and retake + * the lock in exclusive mode. Relocking the file will block until + * IREMAPPING is cleared. + */ +#define XFS_IREMAPPING (1U << 15) + /* All inode state flags related to inode reclaim. */ #define XFS_ALL_IRECLAIM_FLAGS (XFS_IRECLAIMABLE | \ XFS_IRECLAIM | \ XFS_NEED_INACTIVE | \ XFS_INACTIVATING) @@ -593,10 +601,11 @@ bool xfs_inode_needs_inactive(struct xfs_inode *ip);
void xfs_end_io(struct work_struct *work);
int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_iunlock2_remapping(struct xfs_inode *ip1, struct xfs_inode *ip2);
static inline bool xfs_inode_unlinked_incomplete( struct xfs_inode *ip) { diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index fe46bce8cae6..004f5a0444be 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1537,10 +1537,14 @@ xfs_reflink_remap_prep( ret = xfs_flush_unmap_range(dest, pos_out, *len); } if (ret) goto out_unlock;
+ xfs_iflags_set(src, XFS_IREMAPPING); + if (inode_in != inode_out) + xfs_ilock_demote(src, XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL); + return 0; out_unlock: xfs_iunlock2_io_mmap(src, dest); return ret; }
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 14a537983b228cb050ceca3a5b743d01315dc4aa
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik leah.rumancik@gmail.com
Commit author: Catherine Hoang catherine.hoang@oracle.com

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: d7d84772c3f0)
6.1.y | Present (different SHA1: 9e20b44a856b)
Note: The patch differs from the upstream commit: --- 1: 14a537983b228 ! 1: a0286e9750934 xfs: allow read IO and FICLONE to run concurrently @@ Metadata ## Commit message ## xfs: allow read IO and FICLONE to run concurrently
+ [ Upstream commit 14a537983b228cb050ceca3a5b743d01315dc4aa ] + One of our VM cluster management products needs to snapshot KVM image files so that they can be restored in case of failure. Snapshotting is done by redirecting VM disk writes to a sidecar file and using reflink @@ Commit message Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_file.c ## @@ fs/xfs/xfs_file.c: xfs_ilock_iocb( @@ fs/xfs/xfs_file.c: xfs_file_remap_range( + xfs_iunlock2_remapping(src, dest); if (ret) trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); - return remapped > 0 ? remapped : ret; + /* @@ fs/xfs/xfs_file.c: __xfs_filemap_fault( struct inode *inode = file_inode(vmf->vma->vm_file); struct xfs_inode *ip = XFS_I(inode); vm_fault_t ret; + unsigned int lock_mode = 0;
- trace_xfs_filemap_fault(ip, order, write_fault); + trace_xfs_filemap_fault(ip, pe_size, write_fault);
@@ fs/xfs/xfs_file.c: __xfs_filemap_fault( file_update_time(vmf->vma->vm_file); @@ fs/xfs/xfs_file.c: __xfs_filemap_fault( pfn_t pfn;
- xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - ret = xfs_dax_fault(vmf, order, write_fault, &pfn); + ret = xfs_dax_fault(vmf, pe_size, write_fault, &pfn); if (ret & VM_FAULT_NEEDDSYNC) - ret = dax_finish_sync_fault(vmf, order, pfn); + ret = dax_finish_sync_fault(vmf, pe_size, pfn); - xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); + } else if (write_fault) { + ret = iomap_page_mkwrite(vmf, &xfs_page_mkwrite_iomap_ops); ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Long Li leo.lilong@huawei.com
[ Upstream commit 2a5db859c6825b5d50377dda9c3cc729c20cad43 ]
Factor xfs_defer_pending_abort() out of xfs_defer_trans_abort(). The new helper does not take a transaction parameter, so it can be used after the transaction's life cycle has ended.
Signed-off-by: Long Li leo.lilong@huawei.com
Reviewed-by: Darrick J. Wong djwong@kernel.org
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/libxfs/xfs_defer.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index 5a321b783398..c9adb649e9b3 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -243,32 +243,39 @@ xfs_defer_create_intents( ret |= ret2; } return ret; }
-/* Abort all the intents that were committed. */ STATIC void -xfs_defer_trans_abort( - struct xfs_trans *tp, - struct list_head *dop_pending) +xfs_defer_pending_abort( + struct xfs_mount *mp, + struct list_head *dop_list) { struct xfs_defer_pending *dfp; const struct xfs_defer_op_type *ops;
- trace_xfs_defer_trans_abort(tp, _RET_IP_); - /* Abort intent items that don't have a done item. */ - list_for_each_entry(dfp, dop_pending, dfp_list) { + list_for_each_entry(dfp, dop_list, dfp_list) { ops = defer_op_types[dfp->dfp_type]; - trace_xfs_defer_pending_abort(tp->t_mountp, dfp); + trace_xfs_defer_pending_abort(mp, dfp); if (dfp->dfp_intent && !dfp->dfp_done) { ops->abort_intent(dfp->dfp_intent); dfp->dfp_intent = NULL; } } }
+/* Abort all the intents that were committed. */ +STATIC void +xfs_defer_trans_abort( + struct xfs_trans *tp, + struct list_head *dop_pending) +{ + trace_xfs_defer_trans_abort(tp, _RET_IP_); + xfs_defer_pending_abort(tp->t_mountp, dop_pending); +} + /* * Capture resources that the caller said not to release ("held") when the * transaction commits. Caller is responsible for zero-initializing @dres. */ static int
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 2a5db859c6825b5d50377dda9c3cc729c20cad43
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik leah.rumancik@gmail.com
Commit author: Long Li leo.lilong@huawei.com

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 23f3d79fc983)
6.1.y | Present (different SHA1: d28caa7f7c3b)
Note: The patch differs from the upstream commit: --- 1: 2a5db859c6825 ! 1: 79526dfb16ec1 xfs: factor out xfs_defer_pending_abort @@ Metadata ## Commit message ## xfs: factor out xfs_defer_pending_abort
+ [ Upstream commit 2a5db859c6825b5d50377dda9c3cc729c20cad43 ] + Factor out xfs_defer_pending_abort() from xfs_defer_trans_abort(), which not use transaction parameter, so it can be used after the transaction life cycle. @@ Commit message Signed-off-by: Long Li leo.lilong@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/libxfs/xfs_defer.c ## @@ fs/xfs/libxfs/xfs_defer.c: xfs_defer_create_intents( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Long Li leo.lilong@huawei.com
[ Upstream commit f8f9d952e42dd49ae534f61f2fa7ca0876cb9848 ]
When recovering intents, we capture newly created intent items as part of committing recovered intent items. If intent recovery fails at a later point, we forget to remove those newly created intent items from the AIL and hang:
[root@localhost ~]# cat /proc/539/stack
[<0>] xfs_ail_push_all_sync+0x174/0x230
[<0>] xfs_unmount_flush_inodes+0x8d/0xd0
[<0>] xfs_mountfs+0x15f7/0x1e70
[<0>] xfs_fs_fill_super+0x10ec/0x1b20
[<0>] get_tree_bdev+0x3c8/0x730
[<0>] vfs_get_tree+0x89/0x2c0
[<0>] path_mount+0xecf/0x1800
[<0>] do_mount+0xf3/0x110
[<0>] __x64_sys_mount+0x154/0x1f0
[<0>] do_syscall_64+0x39/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
When newly created intent items fail to commit via transaction, intent recovery hasn't created done items for these newly created intent items, so the capture structure is the sole owner of the captured intent items. We must release them explicitly or else they leak:
unreferenced object 0xffff888016719108 (size 432):
  comm "mount", pid 529, jiffies 4294706839 (age 144.463s)
  hex dump (first 32 bytes):
    08 91 71 16 80 88 ff ff 08 91 71 16 80 88 ff ff  ..q.......q.....
    18 91 71 16 80 88 ff ff 18 91 71 16 80 88 ff ff  ..q.......q.....
  backtrace:
    [<ffffffff8230c68f>] xfs_efi_init+0x18f/0x1d0
    [<ffffffff8230c720>] xfs_extent_free_create_intent+0x50/0x150
    [<ffffffff821b671a>] xfs_defer_create_intents+0x16a/0x340
    [<ffffffff821bac3e>] xfs_defer_ops_capture_and_commit+0x8e/0xad0
    [<ffffffff82322bb9>] xfs_cui_item_recover+0x819/0x980
    [<ffffffff823289b6>] xlog_recover_process_intents+0x246/0xb70
    [<ffffffff8233249a>] xlog_recover_finish+0x8a/0x9a0
    [<ffffffff822eeafb>] xfs_log_mount_finish+0x2bb/0x4a0
    [<ffffffff822c0f4f>] xfs_mountfs+0x14bf/0x1e70
    [<ffffffff822d1f80>] xfs_fs_fill_super+0x10d0/0x1b20
    [<ffffffff81a21fa2>] get_tree_bdev+0x3d2/0x6d0
    [<ffffffff81a1ee09>] vfs_get_tree+0x89/0x2c0
    [<ffffffff81a9f35f>] path_mount+0xecf/0x1800
    [<ffffffff81a9fd83>] do_mount+0xf3/0x110
    [<ffffffff81aa00e4>] __x64_sys_mount+0x154/0x1f0
    [<ffffffff83968739>] do_syscall_64+0x39/0x80
Fix the problem above by aborting intent items that don't have a done item when intent recovery fails.
Fixes: e6fff81e4870 ("xfs: proper replay of deferred ops queued during log recovery")
Signed-off-by: Long Li leo.lilong@huawei.com
Reviewed-by: Darrick J. Wong djwong@kernel.org
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/libxfs/xfs_defer.c | 5 +++--
 fs/xfs/libxfs/xfs_defer.h | 2 +-
 fs/xfs/xfs_log_recover.c  | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index c9adb649e9b3..92470ed3fcbd 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -759,16 +759,17 @@ xfs_defer_ops_capture( return dfc; }
/* Release all resources that we used to capture deferred ops. */ void -xfs_defer_ops_capture_free( +xfs_defer_ops_capture_abort( struct xfs_mount *mp, struct xfs_defer_capture *dfc) { unsigned short i;
+ xfs_defer_pending_abort(mp, &dfc->dfc_dfops); xfs_defer_cancel_list(mp, &dfc->dfc_dfops);
for (i = 0; i < dfc->dfc_held.dr_bufs; i++) xfs_buf_relse(dfc->dfc_held.dr_bp[i]);
@@ -805,11 +806,11 @@ xfs_defer_ops_capture_and_commit( return xfs_trans_commit(tp);
/* Commit the transaction and add the capture structure to the list. */ error = xfs_trans_commit(tp); if (error) { - xfs_defer_ops_capture_free(mp, dfc); + xfs_defer_ops_capture_abort(mp, dfc); return error; }
list_add_tail(&dfc->dfc_list, capture_list); return 0; diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index 114a3a4930a3..8788ad5f6a73 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -119,11 +119,11 @@ struct xfs_defer_capture { */ int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp, struct list_head *capture_list); void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp, struct xfs_defer_resources *dres); -void xfs_defer_ops_capture_free(struct xfs_mount *mp, +void xfs_defer_ops_capture_abort(struct xfs_mount *mp, struct xfs_defer_capture *d); void xfs_defer_resources_rele(struct xfs_defer_resources *dres);
int __init xfs_defer_init_item_caches(void); void xfs_defer_destroy_item_caches(void); diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 006a376c34b2..e009bb23d8a2 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2512,11 +2512,11 @@ xlog_abort_defer_ops( struct xfs_defer_capture *dfc; struct xfs_defer_capture *next;
list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { list_del_init(&dfc->dfc_list); - xfs_defer_ops_capture_free(mp, dfc); + xfs_defer_ops_capture_abort(mp, dfc); } }
/* * When this is called, all of the log intent items which did not have
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: f8f9d952e42dd49ae534f61f2fa7ca0876cb9848
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik leah.rumancik@gmail.com
Commit author: Long Li leo.lilong@huawei.com

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 005be6684225)
6.1.y | Present (different SHA1: 6beaebf68934)
Note: The patch differs from the upstream commit: --- 1: f8f9d952e42dd ! 1: 7e1e53b9da92c xfs: abort intent items when recovery intents fail @@ Metadata ## Commit message ## xfs: abort intent items when recovery intents fail
+ [ Upstream commit f8f9d952e42dd49ae534f61f2fa7ca0876cb9848 ] + When recovering intents, we capture newly created intent items as part of committing recovered intent items. If intent recovery fails at a later point, we forget to remove those newly created intent items from the AIL @@ Commit message Signed-off-by: Long Li leo.lilong@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/libxfs/xfs_defer.c ## @@ fs/xfs/libxfs/xfs_defer.c: xfs_defer_ops_capture( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Christoph Hellwig hch@lst.de
[ Upstream commit 55f669f34184ecb25b8353f29c7f6f1ae5b313d1 ]
xfs_reflink_end_cow_extent looks up the COW extent and the data fork extent at offset_fsb, and then proceeds to remap the common subset between the two.
It does not, however, limit the remapped extent to the passed in [*offset_fsb, end_fsb] range and thus potentially remaps more blocks than the ones handled by the current I/O completion. This means that with sufficiently large data and COW extents we could be remapping COW fork mappings that have not been written to, leading to a stale data exposure on a powerfail event.

We used to have a xfs_trim_range to make the remap fit the I/O completion range, but that got (apparently accidentally) removed in commit df2fd88f8ac7 ("xfs: rewrite xfs_reflink_end_cow to use intents").
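To make the trimming step concrete, here is a simplified user-space model of clamping an extent to an I/O completion range. The struct toy_extent and toy_trim_extent names are hypothetical, and the sketch leaves out any special handling the kernel helper may need for delayed-allocation or hole mappings. It is meant only to show the arithmetic, not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent record (all units are blocks). */
struct toy_extent {
	uint64_t startoff;	/* file offset of the first block */
	uint64_t startblock;	/* disk block of the first block */
	uint64_t blockcount;	/* length of the mapping */
};

/* Clamp *e to the file range [bno, bno + len). */
static void toy_trim_extent(struct toy_extent *e, uint64_t bno, uint64_t len)
{
	uint64_t end = bno + len;
	uint64_t e_end = e->startoff + e->blockcount;
	uint64_t distance;

	/* No overlap at all: the trimmed extent is empty. */
	if (e_end <= bno || e->startoff >= end) {
		e->blockcount = 0;
		return;
	}

	/* Cut off the part that starts before the range. */
	if (e->startoff < bno) {
		distance = bno - e->startoff;
		e->startoff += distance;
		e->startblock += distance;
		e->blockcount -= distance;
	}

	/* Cut off the part that extends past the range. */
	if (end < e_end)
		e->blockcount -= e_end - end;
}
```

With a 100-block COW extent at file offset 0 and a completion covering blocks 10 to 29, only those 20 blocks remain after trimming, so unwritten blocks outside the completion are never remapped.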
Note that I've only found this by code inspection, and a test case would probably require very specific delay and error injection.
Fixes: df2fd88f8ac7 ("xfs: rewrite xfs_reflink_end_cow to use intents")
Signed-off-by: Christoph Hellwig hch@lst.de
Reviewed-by: "Darrick J. Wong" djwong@kernel.org
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/xfs_reflink.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 004f5a0444be..cbdc23217a42 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -781,10 +781,11 @@ xfs_reflink_end_cow_extent( *offset_fsb = end_fsb; goto out_cancel; } } del = got; + xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
/* Grab the corresponding mapping in the data fork. */ nmaps = 1; error = xfs_bmapi_read(ip, del.br_startoff, del.br_blockcount, &data, &nmaps, 0);
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 55f669f34184ecb25b8353f29c7f6f1ae5b313d1
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik leah.rumancik@gmail.com
Commit author: Christoph Hellwig hch@lst.de

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 767a94d81616)
6.1.y | Present (different SHA1: 09e751cf562d)
Note: The patch differs from the upstream commit: --- 1: 55f669f34184e ! 1: 7f296145cb8e6 xfs: only remap the written blocks in xfs_reflink_end_cow_extent @@ Metadata ## Commit message ## xfs: only remap the written blocks in xfs_reflink_end_cow_extent
+ [ Upstream commit 55f669f34184ecb25b8353f29c7f6f1ae5b313d1 ] + xfs_reflink_end_cow_extent looks up the COW extent and the data fork extent at offset_fsb, and then proceeds to remap the common subset between the two. @@ Commit message Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_reflink.c ## @@ fs/xfs/xfs_reflink.c: xfs_reflink_end_cow_extent( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
[ Upstream commit 471de20303dda0b67981e06d59cc6c4a83fd2a3c ]
We flush the data device cache before we issue external log IO. If the flush fails, we shut down the log immediately and return. However, the iclog->ic_sema is left in a decremented state so let's add an up(). Prior to this patch, xfs/438 would fail consistently when running with an external log device:
sync
-> xfs_log_force
-> xlog_write_iclog
-> down(&iclog->ic_sema)
-> blkdev_issue_flush (fail causes us to initiate shutdown)
-> xlog_force_shutdown
-> return

unmount
-> xfs_log_umount
-> xlog_wait_iclog_completion
-> down(&iclog->ic_sema) --------> HANG
There is a second early return / shutdown. Make sure the up() happens for it as well. Also make sure we clean up the iclog state, xlog_state_done_syncing, before dropping the iclog lock.
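The control flow this fix aims for can be sketched in user-space C as a single exit path shared by every failure case. Everything here is a hypothetical model: toy_log, toy_iclog and toy_write_iclog are illustrative names, the semaphore is a plain counter, and the synced flag stands in for xlog_state_done_syncing() having run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical log model: only tracks whether it has been shut down. */
struct toy_log {
	bool shut_down;
};

/* Hypothetical iclog model. */
struct toy_iclog {
	int sema;	/* 1 = free, 0 = held */
	bool synced;	/* iclog state was cleaned up */
};

/*
 * Returns true if the I/O was "submitted" (the semaphore stays held until
 * completion), false if we bailed out. Every failure path funnels through
 * the labels at the bottom so the semaphore is always released.
 */
static bool toy_write_iclog(struct toy_log *log, struct toy_iclog *ic,
			    bool flush_fails, bool map_fails)
{
	assert(ic->sema == 1);
	ic->sema--;		/* down() */

	if (log->shut_down)
		goto sync;
	if (flush_fails)	/* data device flush failed */
		goto shutdown;
	if (map_fails)		/* could not map the iclog data */
		goto shutdown;

	return true;		/* completion will release the semaphore */

shutdown:
	log->shut_down = true;
sync:
	ic->synced = true;	/* clean up state before dropping the lock */
	ic->sema++;		/* up() */
	return false;
}
```

Because both failure cases jump to the same label, a later waiter that takes the semaphore cannot hang on an iclog whose write was abandoned.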
Fixes: b5d721eaae47 ("xfs: external logs need to flush data device")
Fixes: 842a42d126b4 ("xfs: shutdown on failure to add page to log bio")
Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
Reviewed-by: "Darrick J. Wong" djwong@kernel.org
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/xfs_log.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index 59c982297503..ce6b303484cf 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -1889,13 +1889,11 @@ xlog_write_iclog( * the log state machine to propagate I/O errors instead of * doing it here. We kick of the state machine and unlock * the buffer manually, the code needs to be kept in sync * with the I/O completion path. */ - xlog_state_done_syncing(iclog); - up(&iclog->ic_sema); - return; + goto sync; }
/* * We use REQ_SYNC | REQ_IDLE here to tell the block layer the are more * IOs coming immediately after this one. This prevents the block layer @@ -1921,44 +1919,47 @@ xlog_write_iclog( * writeback from the log succeeded. Repeating the flush is * not possible, hence we must shut down with log IO error to * avoid shutdown re-entering this path and erroring out again. */ if (log->l_targ != log->l_mp->m_ddev_targp && - blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) { - xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); - return; - } + blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) + goto shutdown; } if (iclog->ic_flags & XLOG_ICL_NEED_FUA) iclog->ic_bio.bi_opf |= REQ_FUA;
iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
- if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) { - xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); - return; - } + if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) + goto shutdown; + if (is_vmalloc_addr(iclog->ic_data)) flush_kernel_vmap_range(iclog->ic_data, count);
/* * If this log buffer would straddle the end of the log we will have * to split it up into two bios, so that we can continue at the start. */ if (bno + BTOBB(count) > log->l_logBBsize) { struct bio *split;
split = bio_split(&iclog->ic_bio, log->l_logBBsize - bno, GFP_NOIO, &fs_bio_set); bio_chain(split, &iclog->ic_bio); submit_bio(split);
/* restart at logical offset zero for the remainder */ iclog->ic_bio.bi_iter.bi_sector = log->l_logBBstart; }
submit_bio(&iclog->ic_bio); + return; +shutdown: + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); +sync: + xlog_state_done_syncing(iclog); + up(&iclog->ic_sema); }
/* * We need to bump cycle number for the part of the iclog that is * written to the start of the log. Watch out for the header magic
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 471de20303dda0b67981e06d59cc6c4a83fd2a3c
Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: c86562e6918a)
6.1.y | Present (different SHA1: b8a7d6e7d0bb)
Note: The patch differs from the upstream commit: --- 1: 471de20303dda ! 1: 9a80b1e4b9f8a xfs: up(ic_sema) if flushing data device fails @@ Metadata ## Commit message ## xfs: up(ic_sema) if flushing data device fails
+ [ Upstream commit 471de20303dda0b67981e06d59cc6c4a83fd2a3c ] + We flush the data device cache before we issue external log IO. If the flush fails, we shut down the log immediately and return. However, the iclog->ic_sema is left in a decremented state so let's add an up(). @@ Commit message Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_log.c ## @@ fs/xfs/xfs_log.c: xlog_write_iclog( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
From: Omar Sandoval osandov@fb.com
[ Upstream commit f63a5b3769ad7659da4c0420751d78958ab97675 ]
We've been seeing XFS errors like the following:
XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c. Caller xfs_btree_insert+0x1ec/0x280
...
Call Trace:
 xfs_corruption_error+0x94/0xa0
 xfs_btree_insert+0x221/0x280
 xfs_alloc_fixup_trees+0x104/0x3e0
 xfs_alloc_ag_vextent_size+0x667/0x820
 xfs_alloc_fix_freelist+0x5d9/0x750
 xfs_free_extent_fix_freelist+0x65/0xa0
 __xfs_free_extent+0x57/0x180
 ...
This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when xfs_btree_insrec() fails.
After converting this into a panic and dissecting the core dump, I found that xfs_btree_insrec() is failing because it's trying to split a leaf node in the cntbt when the AG free list is empty. In particular, it's failing to get a block from the AGFL _while trying to refill the AGFL_.
If a single operation splits every level of the bnobt and the cntbt (and the rmapbt if it is enabled) at once, the free list will be empty. Then, when the next operation tries to refill the free list, it allocates space. If the allocation does not use a full extent, it will need to insert records for the remaining space in the bnobt and cntbt. And if those new records go in full leaves, the leaves (and potentially more nodes up to the old root) need to be split.
Fix it by accounting for the additional splits that may be required to refill the free list in the calculation for the minimum free list size.
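The new accounting can be checked with a small user-space calculation. This is a sketch of the arithmetic only, using the "2 * new height - 2" formula from the patch. The helper names btree_min_free and toy_min_freelist are hypothetical, and the rmapbt term is omitted for brevity.

```c
#include <assert.h>

/*
 * Hypothetical helper: AGFL blocks one btree may need in the worst case.
 * new_height is the height after a possible root split, capped at the
 * maximum height: one full-height split plus a second split of every
 * level under the old root while refilling the AGFL.
 */
static unsigned int btree_min_free(unsigned int cur_levels,
				   unsigned int max_levels)
{
	unsigned int new_height;

	assert(max_levels > 0);
	new_height = cur_levels + 1 < max_levels ? cur_levels + 1 : max_levels;
	return 2 * new_height - 2;
}

/* Minimum free list size for the bnobt and cntbt combined. */
static unsigned int toy_min_freelist(unsigned int bno_levels,
				     unsigned int cnt_levels,
				     unsigned int alloc_maxlevels)
{
	return btree_min_free(bno_levels, alloc_maxlevels) +
	       btree_min_free(cnt_levels, alloc_maxlevels);
}
```

For two single-level btrees the result is 4, the same as before the patch. The reservation only grows once a tree has two or more levels: a two-level tree now reserves 4 blocks where the old min(levels + 1, maxlevels) calculation reserved 3.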
P.S. As far as I can tell, this bug has existed for a long time -- maybe back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS macros ...") in April 1994! It requires a very unlucky sequence of events, and in fact we didn't hit it until a particular sparse mmap workload updated from 5.12 to 5.19. But this bug existed in 5.12, so it must've been exposed by some other change in allocation or writeback patterns. It's also much less likely to be hit with the rmapbt enabled, since that increases the minimum free list size and is unlikely to split at the same time as the bnobt and cntbt.
Reviewed-by: "Darrick J. Wong" djwong@kernel.org
Reviewed-by: Dave Chinner dchinner@redhat.com
Signed-off-by: Omar Sandoval osandov@fb.com
Signed-off-by: Chandan Babu R chandanbabu@kernel.org
Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
---
 fs/xfs/libxfs/xfs_alloc.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 8bb024b06b95..74d039bdc9f7 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -2271,20 +2271,41 @@ xfs_alloc_min_freelist( const uint8_t *levels = pag ? pag->pagf_levels : fake_levels; unsigned int min_free;
ASSERT(mp->m_alloc_maxlevels > 0);
+ /* + * For a btree shorter than the maximum height, the worst case is that + * every level gets split and a new level is added, then while inserting + * another entry to refill the AGFL, every level under the old root gets + * split again. This is: + * + * (full height split reservation) + (AGFL refill split height) + * = (current height + 1) + (current height - 1) + * = (new height) + (new height - 2) + * = 2 * new height - 2 + * + * For a btree of maximum height, the worst case is that every level + * under the root gets split, then while inserting another entry to + * refill the AGFL, every level under the root gets split again. This is + * also: + * + * 2 * (current height - 1) + * = 2 * (new height - 1) + * = 2 * new height - 2 + */ + /* space needed by-bno freespace btree */ min_free = min_t(unsigned int, levels[XFS_BTNUM_BNOi] + 1, - mp->m_alloc_maxlevels); + mp->m_alloc_maxlevels) * 2 - 2; /* space needed by-size freespace btree */ min_free += min_t(unsigned int, levels[XFS_BTNUM_CNTi] + 1, - mp->m_alloc_maxlevels); + mp->m_alloc_maxlevels) * 2 - 2; /* space needed reverse mapping used space btree */ if (xfs_has_rmapbt(mp)) min_free += min_t(unsigned int, levels[XFS_BTNUM_RMAPi] + 1, - mp->m_rmap_maxlevels); + mp->m_rmap_maxlevels) * 2 - 2;
return min_free; }
/*
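
As an aside, the arithmetic in the new comment can be checked outside the kernel. The following is only a minimal sketch: the helper names (min_uint, btree_min_free, btree_min_free_old) are hypothetical stand-ins and not kernel functions, and min_uint merely imitates what min_t(unsigned int, ...) does in the patch.

```c
/* Hypothetical stand-in for the kernel's min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Per-btree reservation after the fix: a full-height split plus a second
 * split of every level under the old root while the AGFL is refilled,
 * i.e. 2 * new height - 2, where new height is capped at maxlevels.
 */
static unsigned int btree_min_free(unsigned int height, unsigned int maxlevels)
{
	unsigned int new_height = min_uint(height + 1, maxlevels);

	return 2 * new_height - 2;
}

/* Per-btree reservation before the fix: one block per level of the grown tree. */
static unsigned int btree_min_free_old(unsigned int height,
				       unsigned int maxlevels)
{
	return min_uint(height + 1, maxlevels);
}
```

For a two-level btree with a seven-level maximum, the old reservation is 3 blocks and the new one is 4; for a tree already at the maximum height of 7, the new reservation is 12.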
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: f63a5b3769ad7659da4c0420751d78958ab97675
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Omar Sandoval <osandov@fb.com>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 0838177b012b)
6.1.y | Present (different SHA1: 4b83cf86531e)
Note: The patch differs from the upstream commit: --- 1: f63a5b3769ad7 ! 1: c15b0d3490300 xfs: fix internal error from AGFL exhaustion @@ Metadata ## Commit message ## xfs: fix internal error from AGFL exhaustion
+ [ Upstream commit f63a5b3769ad7659da4c0420751d78958ab97675 ] + We've been seeing XFS errors like the following:
XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c. Caller xfs_btree_insert+0x1ec/0x280 @@ Commit message Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Omar Sandoval osandov@fb.com Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/libxfs/xfs_alloc.c ## @@ fs/xfs/libxfs/xfs_alloc.c: xfs_alloc_min_freelist( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |

From: Dave Chinner <dchinner@redhat.com>
[ Upstream commit 038ca189c0d2c1570b4d922f25b524007c85cf94 ]
Discovered when trying to track down a weird recovery corruption issue that wasn't detected at recovery time.
The specific corruption was a zero extent count field when big extent counts are in use, and it turns out the dinode verifier doesn't detect that specific corruption case, either. So fix it too.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/libxfs/xfs_inode_buf.c   |  3 +++
 fs/xfs/xfs_inode_item_recover.c | 14 +++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 758aacd8166b..601b05ca5fc2 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -505,10 +505,13 @@ xfs_dinode_verify(
 
 	/* Fork checks carried over from xfs_iformat_fork */
 	if (mode && nextents + naextents > nblocks)
 		return __this_address;
 
+	if (nextents + naextents == 0 && nblocks != 0)
+		return __this_address;
+
 	if (S_ISDIR(mode) && nextents > mp->m_dir_geo->max_extents)
 		return __this_address;
 
 	if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize)
 		return __this_address;
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index e6609067ef26..144198a6b270 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -284,10 +284,11 @@ xlog_recover_inode_commit_pass2(
 	int				attr_index;
 	uint				fields;
 	struct xfs_log_dinode		*ldip;
 	uint				isize;
 	int				need_free = 0;
+	xfs_failaddr_t			fa;
 
 	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
 		in_f = item->ri_buf[0].i_addr;
 	} else {
 		in_f = kmem_alloc(sizeof(struct xfs_inode_log_format), 0);
@@ -528,12 +529,23 @@ xlog_recover_inode_commit_pass2(
 	/* Recover the swapext owner change unless inode has been deleted */
 	if ((in_f->ilf_fields & (XFS_ILOG_DOWNER|XFS_ILOG_AOWNER)) &&
 	    (dip->di_mode != 0))
 		error = xfs_recover_inode_owner_change(mp, dip, in_f,
 						       buffer_list);
-	/* re-generate the checksum. */
+	/* re-generate the checksum and validate the recovered inode. */
 	xfs_dinode_calc_crc(log->l_mp, dip);
+	fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip);
+	if (fa) {
+		XFS_CORRUPTION_ERROR(
+			"Bad dinode after recovery",
+				XFS_ERRLEVEL_LOW, mp, dip, sizeof(*dip));
+		xfs_alert(mp,
+			"Metadata corruption detected at %pS, inode 0x%llx",
+			fa, in_f->ilf_ino);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
 
 	ASSERT(bp->b_mount == mp);
 	bp->b_flags |= _XBF_LOGRECOVERY;
 	xfs_buf_delwri_queue(bp, buffer_list);
 
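
For readers who want to see the new verifier condition in isolation, here is a minimal userspace sketch. The struct and function names (fork_counts, fork_counts_ok) are hypothetical and only model the two counter checks shown in the xfs_dinode_verify() hunk; the real verifier performs many more checks and returns a failure address rather than a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the counters the verifier compares. */
struct fork_counts {
	uint64_t nextents;	/* data fork extent count */
	uint64_t naextents;	/* attr fork extent count */
	uint64_t nblocks;	/* blocks owned by the inode */
	uint16_t mode;		/* zero for a free inode */
};

/* Returns true when the counters are mutually consistent. */
static bool fork_counts_ok(const struct fork_counts *fc)
{
	/* Existing check: an in-use inode cannot map more extents than blocks. */
	if (fc->mode && fc->nextents + fc->naextents > fc->nblocks)
		return false;

	/* Check added by this patch: blocks with no extents at all is corrupt. */
	if (fc->nextents + fc->naextents == 0 && fc->nblocks != 0)
		return false;

	return true;
}
```

A record with zero extents but eight blocks, which is the case described in the commit message, is rejected by the second test, while a record with one extent and eight blocks passes both.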
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 038ca189c0d2c1570b4d922f25b524007c85cf94
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Dave Chinner <dchinner@redhat.com>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: b28b234276a8)
6.1.y | Present (different SHA1: e8f4f518d29f)
Note: The patch differs from the upstream commit: --- 1: 038ca189c0d2c ! 1: 8f643a2b86e6b xfs: inode recovery does not validate the recovered inode @@ Metadata ## Commit message ## xfs: inode recovery does not validate the recovered inode
+ [ Upstream commit 038ca189c0d2c1570b4d922f25b524007c85cf94 ] + Discovered when trying to track down a weird recovery corruption issue that wasn't detected at recovery time.
@@ Commit message Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/libxfs/xfs_inode_buf.c ## @@ fs/xfs/libxfs/xfs_inode_buf.c: xfs_dinode_verify( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |

From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit ed17f7da5f0c8b65b7b5f7c98beb0aadbc0546ee ]
Since the introduction of xfs_dqblk in V5, xfs really ought to find the dqblk pointer from the dquot buffer, then compute the xfs_disk_dquot pointer from the dqblk pointer. Fix the open-coded xfs_buf_offset calls and do the type checking in the correct order.
Note that this has made no practical difference since the start of the xfs_disk_dquot is coincident with the start of the xfs_dqblk.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_dquot.c              | 5 +++--
 fs/xfs/xfs_dquot_item_recover.c | 7 ++++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 7f071757f278..a8b2f3b278ea 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -560,11 +560,12 @@ xfs_dquot_check_type(
 STATIC int
 xfs_dquot_from_disk(
 	struct xfs_dquot	*dqp,
 	struct xfs_buf		*bp)
 {
-	struct xfs_disk_dquot	*ddqp = bp->b_addr + dqp->q_bufoffset;
+	struct xfs_dqblk	*dqb = xfs_buf_offset(bp, dqp->q_bufoffset);
+	struct xfs_disk_dquot	*ddqp = &dqb->dd_diskdq;
 
 	/*
 	 * Ensure that we got the type and ID we were looking for.
 	 * Everything else was checked by the dquot buffer verifier.
 	 */
@@ -1248,11 +1249,11 @@ xfs_qm_dqflush(
 		error = -EFSCORRUPTED;
 		goto out_abort;
 	}
 
 	/* Flush the incore dquot to the ondisk buffer. */
-	dqblk = bp->b_addr + dqp->q_bufoffset;
+	dqblk = xfs_buf_offset(bp, dqp->q_bufoffset);
 	xfs_dquot_to_disk(&dqblk->dd_diskdq, dqp);
 
 	/*
 	 * Clear the dirty field and remember the flush lsn for later use.
 	 */
diff --git a/fs/xfs/xfs_dquot_item_recover.c b/fs/xfs/xfs_dquot_item_recover.c
index 8966ba842395..db2cb5e4197b 100644
--- a/fs/xfs/xfs_dquot_item_recover.c
+++ b/fs/xfs/xfs_dquot_item_recover.c
@@ -63,10 +63,11 @@ xlog_recover_dquot_commit_pass2(
 	struct xlog_recover_item	*item,
 	xfs_lsn_t			current_lsn)
 {
 	struct xfs_mount		*mp = log->l_mp;
 	struct xfs_buf			*bp;
+	struct xfs_dqblk		*dqb;
 	struct xfs_disk_dquot		*ddq, *recddq;
 	struct xfs_dq_logformat		*dq_f;
 	xfs_failaddr_t			fa;
 	int				error;
 	uint				type;
@@ -128,28 +129,28 @@ xlog_recover_dquot_commit_pass2(
 				   &xfs_dquot_buf_ops);
 	if (error)
 		return error;
 
 	ASSERT(bp);
-	ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
+	dqb = xfs_buf_offset(bp, dq_f->qlf_boffset);
+	ddq = &dqb->dd_diskdq;
 
 	/*
 	 * If the dquot has an LSN in it, recover the dquot only if it's less
 	 * than the lsn of the transaction we are replaying.
 	 */
 	if (xfs_has_crc(mp)) {
-		struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
 		xfs_lsn_t	lsn = be64_to_cpu(dqb->dd_lsn);
 
 		if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
 			goto out_release;
 		}
 	}
 
 	memcpy(ddq, recddq, item->ri_buf[1].i_len);
 	if (xfs_has_crc(mp)) {
-		xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
+		xfs_update_cksum((char *)dqb, sizeof(struct xfs_dqblk),
 				 XFS_DQUOT_CRC_OFF);
 	}
 
 	ASSERT(dq_f->qlf_size == 2);
 	ASSERT(bp->b_mount == mp);
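
The commit message's remark that the change makes no practical difference can be illustrated with a small standalone sketch. The structures below (disk_dquot, dqblk) are hypothetical, heavily trimmed stand-ins for xfs_disk_dquot and xfs_dqblk, and buf_to_dqblk only imitates the pointer arithmetic that xfs_buf_offset() performs; the one property assumed from the patch is that dd_diskdq is the first member of the block structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-in for the on-disk dquot record. */
struct disk_dquot {
	uint16_t magic;
	uint32_t id;
};

/* Hypothetical, trimmed stand-in for the V5 dquot block wrapper. */
struct dqblk {
	struct disk_dquot dd_diskdq;	/* first member, so at offset 0 */
	uint64_t dd_lsn;
};

/* Locate the dqblk at a byte offset inside a buffer. */
static struct dqblk *buf_to_dqblk(void *buf, size_t offset)
{
	return (struct dqblk *)((char *)buf + offset);
}

/* Derive the disk dquot from the dqblk, in the order the patch prefers. */
static struct disk_dquot *dqblk_to_diskdq(struct dqblk *dqb)
{
	return &dqb->dd_diskdq;
}
```

Because dd_diskdq sits at offset 0, both orders yield the same address; going through the dqblk first simply keeps the types honest and avoids casting a disk dquot pointer back to its container.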
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: ed17f7da5f0c8b65b7b5f7c98beb0aadbc0546ee
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: d744e578802a)
6.1.y | Present (different SHA1: 1fd830d98732)
Note: The patch differs from the upstream commit: --- 1: ed17f7da5f0c8 ! 1: ac9397950de5e xfs: clean up dqblk extraction @@ Metadata ## Commit message ## xfs: clean up dqblk extraction
+ [ Upstream commit ed17f7da5f0c8b65b7b5f7c98beb0aadbc0546ee ] + Since the introduction of xfs_dqblk in V5, xfs really ought to find the dqblk pointer from the dquot buffer, then compute the xfs_disk_dquot pointer from the dqblk pointer. Fix the open-coded xfs_buf_offset calls @@ Commit message Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_dquot.c ## @@ fs/xfs/xfs_dquot.c: xfs_dquot_from_disk( ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |

From: "Darrick J. Wong" <djwong@kernel.org>
[ Upstream commit 9c235dfc3d3f901fe22acb20f2ab37ff39f2ce02 ]
When we're recovering ondisk quota records from the log, we need to validate the recovered buffer contents before writing them to disk.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_dquot_item_recover.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/xfs/xfs_dquot_item_recover.c b/fs/xfs/xfs_dquot_item_recover.c
index db2cb5e4197b..2c2720ce6923 100644
--- a/fs/xfs/xfs_dquot_item_recover.c
+++ b/fs/xfs/xfs_dquot_item_recover.c
@@ -17,10 +17,11 @@
 #include "xfs_trans_priv.h"
 #include "xfs_qm.h"
 #include "xfs_log.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_error.h"
 
 STATIC void
 xlog_recover_dquot_ra_pass2(
 	struct xlog			*log,
 	struct xlog_recover_item	*item)
@@ -150,10 +151,23 @@ xlog_recover_dquot_commit_pass2(
 	if (xfs_has_crc(mp)) {
 		xfs_update_cksum((char *)dqb, sizeof(struct xfs_dqblk),
 				 XFS_DQUOT_CRC_OFF);
 	}
 
+	/* Validate the recovered dquot. */
+	fa = xfs_dqblk_verify(log->l_mp, dqb, dq_f->qlf_id);
+	if (fa) {
+		XFS_CORRUPTION_ERROR("Bad dquot after recovery",
+				XFS_ERRLEVEL_LOW, mp, dqb,
+				sizeof(struct xfs_dqblk));
+		xfs_alert(mp,
+			"Metadata corruption detected at %pS, dquot 0x%x",
+			fa, dq_f->qlf_id);
+		error = -EFSCORRUPTED;
+		goto out_release;
+	}
+
 	ASSERT(dq_f->qlf_size == 2);
 	ASSERT(bp->b_mount == mp);
 	bp->b_flags |= _XBF_LOGRECOVERY;
 	xfs_buf_delwri_queue(bp, buffer_list);
 
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 9c235dfc3d3f901fe22acb20f2ab37ff39f2ce02
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Darrick J. Wong <djwong@kernel.org>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 3581868f51a2)
6.1.y | Present (different SHA1: f3eceedfd713)

Note: The patch differs from the upstream commit:
---
Failed to apply patch cleanly, falling back to interdiff...
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |

From: Christoph Hellwig <hch@lst.de>
[ Upstream commit c421df0b19430417a04f68919fc3d1943d20ac04 ]
Introduce a local boolean variable for FS_XFLAG_REALTIME to make the checks for it more obvious, and de-densify a few of the conditionals using it to make them more readable while at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231025141020.192413-4-hch@lst.de
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_ioctl.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 85fbb3b71d1c..f6aa9e6138ae 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1118,27 +1118,29 @@ xfs_ioctl_setattr_xflags(
 	struct xfs_trans	*tp,
 	struct xfs_inode	*ip,
 	struct fileattr		*fa)
 {
 	struct xfs_mount	*mp = ip->i_mount;
+	bool			rtflag = (fa->fsx_xflags & FS_XFLAG_REALTIME);
 	uint64_t		i_flags2;
 
-	/* Can't change realtime flag if any extents are allocated. */
-	if ((ip->i_df.if_nextents || ip->i_delayed_blks) &&
-	    XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
-		return -EINVAL;
+	if (rtflag != XFS_IS_REALTIME_INODE(ip)) {
+		/* Can't change realtime flag if any extents are allocated. */
+		if (ip->i_df.if_nextents || ip->i_delayed_blks)
+			return -EINVAL;
+	}
 
-	/* If realtime flag is set then must have realtime device */
-	if (fa->fsx_xflags & FS_XFLAG_REALTIME) {
+	if (rtflag) {
+		/* If realtime flag is set then must have realtime device */
 		if (mp->m_sb.sb_rblocks == 0 || mp->m_sb.sb_rextsize == 0 ||
 		    (ip->i_extsize % mp->m_sb.sb_rextsize))
 			return -EINVAL;
-	}
 
-	/* Clear reflink if we are actually able to set the rt flag. */
-	if ((fa->fsx_xflags & FS_XFLAG_REALTIME) && xfs_is_reflink_inode(ip))
-		ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+		/* Clear reflink if we are actually able to set the rt flag. */
+		if (xfs_is_reflink_inode(ip))
+			ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+	}
 
 	/* Don't allow us to set DAX mode for a reflinked file for now. */
 	if ((fa->fsx_xflags & FS_XFLAG_DAX) && xfs_is_reflink_inode(ip))
 		return -EINVAL;
 
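
To see that de-densifying the first conditional keeps the same outcome, the old and new shapes can be compared exhaustively in a small sketch. Everything here is a hypothetical model: the three inputs are plain booleans standing in for "has extents or delayed blocks", XFS_IS_REALTIME_INODE(ip), and the requested realtime flag, and a true return value stands in for returning -EINVAL.

```c
#include <stdbool.h>

/* Old shape: one dense conditional. True means the change is rejected. */
static bool rt_change_rejected_old(bool has_extents, bool is_rt_inode,
				   bool want_rt)
{
	return has_extents && is_rt_inode != want_rt;
}

/* New shape: test for a flag flip first, then look at the extent state. */
static bool rt_change_rejected_new(bool has_extents, bool is_rt_inode,
				   bool want_rt)
{
	if (want_rt != is_rt_inode) {
		if (has_extents)
			return true;
	}
	return false;
}

/* Compare both shapes over all eight input combinations. */
static bool rt_forms_agree(void)
{
	for (int i = 0; i < 8; i++) {
		bool a = (i & 1) != 0, b = (i & 2) != 0, c = (i & 4) != 0;

		if (rt_change_rejected_old(a, b, c) !=
		    rt_change_rejected_new(a, b, c))
			return false;
	}
	return true;
}
```

Under this boolean model the two shapes agree on every input. The model assumes both sides of the comparison are already normalized to true/false, which is what the new bool rtflag local provides for the requested flag.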
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: c421df0b19430417a04f68919fc3d1943d20ac04
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Christoph Hellwig <hch@lst.de>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: d7d5ed65364c)
6.1.y | Present (different SHA1: a426d90bf5d7)
Note: The patch differs from the upstream commit: --- 1: c421df0b19430 ! 1: 3bbebf3ef4d3f xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags @@ Metadata ## Commit message ## xfs: clean up FS_XFLAG_REALTIME handling in xfs_ioctl_setattr_xflags
+ [ Upstream commit c421df0b19430417a04f68919fc3d1943d20ac04 ] + Introduce a local boolean variable if FS_XFLAG_REALTIME to make the checks for it more obvious, and de-densify a few of the conditionals using it to make them more readable while at it. @@ Commit message Link: https://lore.kernel.org/r/20231025141020.192413-4-hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_ioctl.c ## @@ fs/xfs/xfs_ioctl.c: xfs_ioctl_setattr_xflags( @@ fs/xfs/xfs_ioctl.c: xfs_ioctl_setattr_xflags( + if (rtflag) { + /* If realtime flag is set then must have realtime device */ if (mp->m_sb.sb_rblocks == 0 || mp->m_sb.sb_rextsize == 0 || - xfs_extlen_to_rtxmod(mp, ip->i_extsize)) + (ip->i_extsize % mp->m_sb.sb_rextsize)) return -EINVAL; - }
@@ fs/xfs/xfs_ioctl.c: xfs_ioctl_setattr_xflags( + ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK; + }
- /* diflags2 only valid for v3 inodes. */ - i_flags2 = xfs_flags2diflags2(ip, fa->fsx_xflags); + /* Don't allow us to set DAX mode for a reflinked file for now. */ + if ((fa->fsx_xflags & FS_XFLAG_DAX) && xfs_is_reflink_inode(ip)) ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |

From: Christoph Hellwig <hch@lst.de>
[ Upstream commit 9c04138414c00ae61421f36ada002712c4bac94a ]
Update the per-folio stable writes flag depending on which device an inode resides on.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231025141020.192413-5-hch@lst.de
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_inode.h | 8 ++++++++
 fs/xfs/xfs_ioctl.c | 8 ++++++++
 fs/xfs/xfs_iops.c  | 7 +++++++
 3 files changed, 23 insertions(+)

diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 3a81477c7797..c177c92f3aa5 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -567,10 +567,18 @@ int xfs_break_layouts(struct inode *inode, uint *iolock,
 /* from xfs_iops.c */
 extern void xfs_setup_inode(struct xfs_inode *ip);
 extern void xfs_setup_iops(struct xfs_inode *ip);
 extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
 
+static inline void xfs_update_stable_writes(struct xfs_inode *ip)
+{
+	if (bdev_stable_writes(xfs_inode_buftarg(ip)->bt_bdev))
+		mapping_set_stable_writes(VFS_I(ip)->i_mapping);
+	else
+		mapping_clear_stable_writes(VFS_I(ip)->i_mapping);
+}
+
 /*
  * When setting up a newly allocated inode, we need to call
  * xfs_finish_inode_setup() once the inode is fully instantiated at
  * the VFS level to prevent the rest of the world seeing the inode
  * before we've completed instantiation. Otherwise we can do it
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index f6aa9e6138ae..c7cb496dc345 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1151,10 +1151,18 @@ xfs_ioctl_setattr_xflags(
 
 	ip->i_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
 	ip->i_diflags2 = i_flags2;
 
 	xfs_diflags_to_iflags(ip, false);
+
+	/*
+	 * Make the stable writes flag match that of the device the inode
+	 * resides on when flipping the RT flag.
+	 */
+	if (rtflag != XFS_IS_REALTIME_INODE(ip) && S_ISREG(VFS_I(ip)->i_mode))
+		xfs_update_stable_writes(ip);
+
 	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 	XFS_STATS_INC(mp, xs_ig_attrchg);
 	return 0;
 }
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 2e10e1c66ad6..6fbdc0a19e54 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1289,10 +1289,17 @@ xfs_setup_inode(
 	 * stacks or deadlocking.
 	 */
 	gfp_mask = mapping_gfp_mask(inode->i_mapping);
 	mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
 
+	/*
+	 * For real-time inodes update the stable write flags to that of the RT
+	 * device instead of the data device.
+	 */
+	if (S_ISREG(inode->i_mode) && XFS_IS_REALTIME_INODE(ip))
+		xfs_update_stable_writes(ip);
+
 	/*
 	 * If there is no attribute fork no ACL can exist on this inode,
 	 * and it can't have any file capabilities attached to it either.
 	 */
 	if (!xfs_inode_has_attr_fork(ip)) {
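
The idea behind xfs_update_stable_writes() can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: model_bdev and model_inode are hypothetical types, and the explicit data_dev/rt_dev parameters replace the kernel's lookup of the inode's buffer target; none of these names exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical model of a block device's stable-writes requirement. */
struct model_bdev {
	bool stable_writes;
};

/* Hypothetical model of an inode and its page-cache mapping flag. */
struct model_inode {
	bool is_realtime;
	bool mapping_stable_writes;
};

/*
 * Copy the stable-writes requirement from whichever device backs the
 * file's data: the RT device for realtime inodes, else the data device.
 */
static void model_update_stable_writes(struct model_inode *ip,
				       const struct model_bdev *data_dev,
				       const struct model_bdev *rt_dev)
{
	const struct model_bdev *target = ip->is_realtime ? rt_dev : data_dev;

	ip->mapping_stable_writes = target->stable_writes;
}
```

In this model, flipping is_realtime and calling the helper again changes the mapping flag to match the other device, which mirrors why the patch calls the real helper both at inode setup and when the RT flag is toggled through the ioctl.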
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 9c04138414c00ae61421f36ada002712c4bac94a
WARNING: Author mismatch between patch and upstream commit:
Backport author: Leah Rumancik <leah.rumancik@gmail.com>
Commit author: Christoph Hellwig <hch@lst.de>

Status in newer kernel trees:
6.13.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (different SHA1: 05955a703b75)
6.1.y | Present (different SHA1: a1118a7188ac)
Note: The patch differs from the upstream commit: --- 1: 9c04138414c00 ! 1: 89b6a2ad5ec85 xfs: respect the stable writes flag on the RT device @@ Metadata ## Commit message ## xfs: respect the stable writes flag on the RT device
+ [ Upstream commit 9c04138414c00ae61421f36ada002712c4bac94a ] + Update the per-folio stable writes flag dependening on which device an inode resides on.
@@ Commit message Link: https://lore.kernel.org/r/20231025141020.192413-5-hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com
## fs/xfs/xfs_inode.h ## @@ fs/xfs/xfs_inode.h: extern void xfs_setup_inode(struct xfs_inode *ip); ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
On Wed, Jan 29, 2025 at 10:46:58AM -0800, Leah Rumancik wrote:
> Returning to focus on 6.1, here is the 6.1 set from the corresponding 6.6 set:
>
> https://lore.kernel.org/all/20240208232054.15778-1-catherine.hoang@oracle.co...
All now queued up, thanks!
greg k-h