Hello again,
This is the 6.1.y backports set from 6.10 which corresponds to the 6.6.y backports set from here:
https://lore.kernel.org/all/20241002174108.64615-1-catherine.hoang@oracle.co...
The following patches were included:
f43bd357fde0 xfs: fix error returns from xfs_bmapi_write 4bcef72d96b5 xfs: fix xfs_bmap_add_extent_delay_real for partial conversions c299188b443a xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent c13c21f77824 xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovey 0934046e3392 xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 9716cdcc2f9e xfs: validate recovered name buffers when recovering xattr items 20adb1e2f069 xfs: revert commit 44af6c7e59b12 f24ba2183148 xfs: match lock mode in xfs_buffered_write_iomap_begin() 0f726c17dfd8 xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional 36081fd0ee37 xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset 4c99f3026cf2 xfs: convert delayed extents to unwritten when zeroing post eof blocks 0aca73915dc1 xfs: allow symlinks with short remote targets 0e52b98bf041 xfs: make sure sb_fdblocks is non-negative 2bc2d49c36c2 xfs: fix freeing speculative preallocations for preallocated files 3eeac3311683 xfs: allow unlinked symlinks and dirs with zero size 9a0ab4fc28ed xfs: restrict when we try to align cow fork delalloc to cowextsz hints
The following patches were omitted:
cad051826d83 xfs: fix missing check for invalid attr flags scrub db460c26f0b0 xfs: check shortform attr entry flags specifically scrub 5689d2345a01 xfs: enforce one namespace per attribute doesn't cherry pick cleanly, it is some refactoring and isn't a dependency for other patches, lets skip it for now 7c03b124353a xfs: use dontcache for grabbing inodes during scrub scrub 740a427e8f45 xfs: fix unlink vs cluster buffer instantiation race already in 6.1.y
No fixes were found on mainline for any of the patches being ported.
The auto group was run 1x on each of these configs:
xfs/4k xfs/1k xfs/logdev xfs/realtime xfs/quota xfs/v4 xfs/dax xfs/adv xfs/dirblock_8k
and no regressions were seen. This set has already been ack'd on the xfs-stable list.
Let me know if you see any issues. Thanks, Leah
Christoph Hellwig (4): xfs: fix error returns from xfs_bmapi_write xfs: fix xfs_bmap_add_extent_delay_real for partial conversions xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent xfs: fix freeing speculative preallocations for preallocated files
Darrick J. Wong (7): xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 xfs: validate recovered name buffers when recovering xattr items xfs: revert commit 44af6c7e59b12 xfs: allow symlinks with short remote targets xfs: allow unlinked symlinks and dirs with zero size xfs: restrict when we try to align cow fork delalloc to cowextsz hints
Wengang Wang (1): xfs: make sure sb_fdblocks is non-negative
Zhang Yi (4): xfs: match lock mode in xfs_buffered_write_iomap_begin() xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset xfs: convert delayed extents to unwritten when zeroing post eof blocks
fs/xfs/libxfs/xfs_attr_remote.c | 1 - fs/xfs/libxfs/xfs_bmap.c | 130 ++++++++++++++++++++++++++------ fs/xfs/libxfs/xfs_da_btree.c | 20 ++--- fs/xfs/libxfs/xfs_inode_buf.c | 47 ++++++++++-- fs/xfs/libxfs/xfs_sb.c | 7 +- fs/xfs/scrub/attr.c | 5 ++ fs/xfs/xfs_aops.c | 54 ++++--------- fs/xfs/xfs_attr_item.c | 88 ++++++++++++++++++--- fs/xfs/xfs_bmap_util.c | 61 +++++++++------ fs/xfs/xfs_bmap_util.h | 2 +- fs/xfs/xfs_dquot.c | 1 - fs/xfs/xfs_icache.c | 2 +- fs/xfs/xfs_inode.c | 14 +--- fs/xfs/xfs_iomap.c | 81 +++++++++++--------- fs/xfs/xfs_reflink.c | 20 ----- fs/xfs/xfs_rtalloc.c | 2 - 16 files changed, 342 insertions(+), 193 deletions(-)
From: Christoph Hellwig hch@lst.de
[ Upstream commit 6773da870ab89123d1b513da63ed59e32a29cb77 ]
xfs_bmapi_write can return 0 without actually returning a mapping in mval in two different cases:
1) when there is absolutely no space available to do an allocation 2) when converting delalloc space, and the allocation is so small that it only covers parts of the delalloc extent before the range requested by the caller
Callers at best can handle one of these cases, but in many cases can't cope with either one. Switch xfs_bmapi_write to always return a mapping or return an error code instead. For case 1) above ENOSPC is the obvious choice which is very much what the callers expect anyway. For case 2) there is no really good error code, so pick a funky one from the SysV streams portfolio.
This fixes the reproducer here:
https://lore.kernel.org/linux-xfs/CAEJPjCvT3Uag-pMTYuigEjWZHn1sGMZ0GCjVVCv29...
which uses reserved blocks to create file systems that are gravely out of space and thus cause at least xfs_file_alloc_space to hang and trigger the lack of ENOSPC handling in xfs_dquot_disk_alloc.
Note that this patch does not actually make any caller but xfs_alloc_file_space deal intelligently with case 2) above.
Signed-off-by: Christoph Hellwig hch@lst.de Reported-by: 刘通 lyutoon@gmail.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_attr_remote.c | 1 - fs/xfs/libxfs/xfs_bmap.c | 46 ++++++++++++++++++++++++++------- fs/xfs/libxfs/xfs_da_btree.c | 20 ++++---------- fs/xfs/xfs_bmap_util.c | 31 +++++++++++----------- fs/xfs/xfs_dquot.c | 1 - fs/xfs/xfs_iomap.c | 8 ------ fs/xfs/xfs_reflink.c | 14 ---------- fs/xfs/xfs_rtalloc.c | 2 -- 8 files changed, 57 insertions(+), 66 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c index d440393b40eb..54de405cbab5 100644 --- a/fs/xfs/libxfs/xfs_attr_remote.c +++ b/fs/xfs/libxfs/xfs_attr_remote.c @@ -617,11 +617,10 @@ xfs_attr_rmtval_set_blk( attr->xattri_blkcnt, XFS_BMAPI_ATTRFORK, args->total, map, &nmap); if (error) return error;
- ASSERT(nmap == 1); ASSERT((map->br_startblock != DELAYSTARTBLOCK) && (map->br_startblock != HOLESTARTBLOCK));
/* roll attribute extent map forwards */ attr->xattri_lblkno += map->br_blockcount; diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 2a3b78032cb0..0235c1dd3d7e 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4111,12 +4111,14 @@ xfs_bmapi_allocate( else error = xfs_bmap_btalloc(bma); } else { error = xfs_bmap_alloc_userdata(bma); } - if (error || bma->blkno == NULLFSBLOCK) + if (error) return error; + if (bma->blkno == NULLFSBLOCK) + return -ENOSPC;
if (bma->flags & XFS_BMAPI_ZERO) { error = xfs_zero_extent(bma->ip, bma->blkno, bma->length); if (error) return error; @@ -4292,10 +4294,19 @@ xfs_bmapi_finish( /* * Map file blocks to filesystem blocks, and allocate blocks or convert the * extent state if necessary. Details behaviour is controlled by the flags * parameter. Only allocates blocks from a single allocation group, to avoid * locking problems. + * + * Returns 0 on success and places the extent mappings in mval. nmaps is used + * as an input/output parameter where the caller specifies the maximum number + * of mappings that may be returned and xfs_bmapi_write passes back the number + * of mappings (including existing mappings) it found. + * + * Returns a negative error code on failure, including -ENOSPC when it could not + * allocate any blocks and -ENOSR when it did allocate blocks to convert a + * delalloc range, but those blocks were before the passed in range. */ int xfs_bmapi_write( struct xfs_trans *tp, /* transaction pointer */ struct xfs_inode *ip, /* incore inode */ @@ -4419,14 +4430,20 @@ xfs_bmapi_write( bma.length = len;
ASSERT(len > 0); ASSERT(bma.length > 0); error = xfs_bmapi_allocate(&bma); - if (error) + if (error) { + /* + * If we already allocated space in a previous + * iteration return what we go so far when + * running out of space. + */ + if (error == -ENOSPC && bma.nallocs) + break; goto error0; - if (bma.blkno == NULLFSBLOCK) - break; + }
/* * If this is a CoW allocation, record the data in * the refcount btree for orphan recovery. */ @@ -4460,22 +4477,36 @@ xfs_bmapi_write( /* Else go on to the next record. */ bma.prev = bma.got; if (!xfs_iext_next_extent(ifp, &bma.icur, &bma.got)) eof = true; } - *nmap = n;
error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags, whichfork); if (error) goto error0;
ASSERT(ifp->if_format != XFS_DINODE_FMT_BTREE || ifp->if_nextents > XFS_IFORK_MAXEXT(ip, whichfork)); xfs_bmapi_finish(&bma, whichfork, 0); xfs_bmap_validate_ret(orig_bno, orig_len, orig_flags, orig_mval, - orig_nmap, *nmap); + orig_nmap, n); + + /* + * When converting delayed allocations, xfs_bmapi_allocate ignores + * the passed in bno and always converts from the start of the found + * delalloc extent. + * + * To avoid a successful return with *nmap set to 0, return the magic + * -ENOSR error code for this particular case so that the caller can + * handle it. + */ + if (!n) { + ASSERT(bma.nallocs >= *nmap); + return -ENOSR; + } + *nmap = n; return 0; error0: xfs_bmapi_finish(&bma, whichfork, error); return error; } @@ -4578,13 +4609,10 @@ xfs_bmapi_convert_delalloc(
error = xfs_bmapi_allocate(&bma); if (error) goto out_finish;
- error = -ENOSPC; - if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK)) - goto out_finish; error = -EFSCORRUPTED; if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) goto out_finish;
XFS_STATS_ADD(mp, xs_xstrat_bytes, XFS_FSB_TO_B(mp, bma.length)); diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index 282c7cf032f4..12e3cca804b7 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2156,61 +2156,51 @@ xfs_da_grow_inode_int( { struct xfs_trans *tp = args->trans; struct xfs_inode *dp = args->dp; int w = args->whichfork; xfs_rfsblock_t nblks = dp->i_nblocks; - struct xfs_bmbt_irec map, *mapp; - int nmap, error, got, i, mapi; + struct xfs_bmbt_irec map, *mapp = ↦ + int nmap, error, got, i, mapi = 1;
/* * Find a spot in the file space to put the new block. */ error = xfs_bmap_first_unused(tp, dp, count, bno, w); if (error) return error;
/* * Try mapping it in one filesystem block. */ nmap = 1; error = xfs_bmapi_write(tp, dp, *bno, count, xfs_bmapi_aflag(w)|XFS_BMAPI_METADATA|XFS_BMAPI_CONTIG, args->total, &map, &nmap); - if (error) - return error; - - ASSERT(nmap <= 1); - if (nmap == 1) { - mapp = ↦ - mapi = 1; - } else if (nmap == 0 && count > 1) { + if (error == -ENOSPC && count > 1) { xfs_fileoff_t b; int c;
/* * If we didn't get it and the block might work if fragmented, * try without the CONTIG flag. Loop until we get it all. */ mapp = kmem_alloc(sizeof(*mapp) * count, 0); for (b = *bno, mapi = 0; b < *bno + count; ) { c = (int)(*bno + count - b); nmap = min(XFS_BMAP_MAX_NMAP, c); error = xfs_bmapi_write(tp, dp, b, c, xfs_bmapi_aflag(w)|XFS_BMAPI_METADATA, args->total, &mapp[mapi], &nmap); if (error) goto out_free_map; - if (nmap < 1) - break; mapi += nmap; b = mapp[mapi - 1].br_startoff + mapp[mapi - 1].br_blockcount; } - } else { - mapi = 0; - mapp = NULL; } + if (error) + goto out_free_map;
/* * Count the blocks we got, make sure it matches the total. */ for (i = 0, got = 0; i < mapi; i++) diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 468bb61a5e46..399451aab26a 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -866,37 +866,36 @@ xfs_alloc_file_space( error = xfs_iext_count_upgrade(tp, ip, XFS_IEXT_ADD_NOSPLIT_CNT); if (error) goto error;
- error = xfs_bmapi_write(tp, ip, startoffset_fsb, - allocatesize_fsb, XFS_BMAPI_PREALLOC, 0, imapp, - &nimaps); - if (error) - goto error; - - ip->i_diflags |= XFS_DIFLAG_PREALLOC; - xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); - - error = xfs_trans_commit(tp); - xfs_iunlock(ip, XFS_ILOCK_EXCL); - if (error) - break; - /* * If the allocator cannot find a single free extent large * enough to cover the start block of the requested range, - * xfs_bmapi_write will return 0 but leave *nimaps set to 0. + * xfs_bmapi_write will return -ENOSR. * * In that case we simply need to keep looping with the same * startoffset_fsb so that one of the following allocations * will eventually reach the requested range. */ - if (nimaps) { + error = xfs_bmapi_write(tp, ip, startoffset_fsb, + allocatesize_fsb, XFS_BMAPI_PREALLOC, 0, imapp, + &nimaps); + if (error) { + if (error != -ENOSR) + goto error; + error = 0; + } else { startoffset_fsb += imapp->br_blockcount; allocatesize_fsb -= imapp->br_blockcount; } + + ip->i_diflags |= XFS_DIFLAG_PREALLOC; + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); + + error = xfs_trans_commit(tp); + xfs_iunlock(ip, XFS_ILOCK_EXCL); }
return error;
error: diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index a8b2f3b278ea..6186b69be50a 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -331,11 +331,10 @@ xfs_dquot_disk_alloc( &nmaps); if (error) goto err_cancel;
ASSERT(map.br_blockcount == XFS_DQUOT_CLUSTER_SIZE_FSB); - ASSERT(nmaps == 1); ASSERT((map.br_startblock != DELAYSTARTBLOCK) && (map.br_startblock != HOLESTARTBLOCK));
/* * Keep track of the blkno to save a lookup later diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index ab5512c0bcf7..60d9638cec6d 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -307,18 +307,10 @@ xfs_iomap_write_direct( */ error = xfs_trans_commit(tp); if (error) goto out_unlock;
- /* - * Copy any maps to caller's array and return any error. - */ - if (nimaps == 0) { - error = -ENOSPC; - goto out_unlock; - } - if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) error = xfs_alert_fsblock_zero(ip, imap);
out_unlock: *seq = xfs_iomap_inode_sequence(ip, 0); diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 1bac6a8af970..0405226fb74c 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -429,17 +429,10 @@ xfs_reflink_fill_cow_hole( xfs_inode_set_cowblocks_tag(ip); error = xfs_trans_commit(tp); if (error) return error;
- /* - * Allocation succeeded but the requested range was not even partially - * satisfied? Bail out! - */ - if (nimaps == 0) - return -ENOSPC; - convert: return xfs_reflink_convert_unwritten(ip, imap, cmap, convert_now);
out_trans_cancel: xfs_trans_cancel(tp); @@ -498,17 +491,10 @@ xfs_reflink_fill_delalloc(
xfs_inode_set_cowblocks_tag(ip); error = xfs_trans_commit(tp); if (error) return error; - - /* - * Allocation succeeded but the requested range was not even - * partially satisfied? Bail out! - */ - if (nimaps == 0) - return -ENOSPC; } while (cmap->br_startoff + cmap->br_blockcount <= imap->br_startoff);
return xfs_reflink_convert_unwritten(ip, imap, cmap, convert_now);
out_trans_cancel: diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 7c5134899634..5cf1e91f4c20 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -838,12 +838,10 @@ xfs_growfs_rt_alloc( * Allocate blocks to the bitmap file. */ nmap = 1; error = xfs_bmapi_write(tp, ip, oblocks, nblocks - oblocks, XFS_BMAPI_METADATA, 0, &map, &nmap); - if (!error && nmap < 1) - error = -ENOSPC; if (error) goto out_trans_cancel; /* * Free any blocks freed up in the transaction, then commit. */
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 6773da870ab89123d1b513da63ed59e32a29cb77
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Christoph Hellwighch@lst.de
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: f43bd357fde0)
Note: The patch differs from the upstream commit: --- 1: 6773da870ab89 ! 1: 00b5c634c7b39 xfs: fix error returns from xfs_bmapi_write @@ Metadata ## Commit message ## xfs: fix error returns from xfs_bmapi_write
+ [ Upstream commit 6773da870ab89123d1b513da63ed59e32a29cb77 ] + xfs_bmapi_write can return 0 without actually returning a mapping in mval in two different cases:
@@ Commit message Reported-by: 刘通 lyutoon@gmail.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_attr_remote.c ## @@ fs/xfs/libxfs/xfs_attr_remote.c: xfs_attr_rmtval_set_blk( @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_write( return 0; error0: xfs_bmapi_finish(&bma, whichfork, error); -@@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_convert_one_delalloc( +@@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_convert_delalloc( if (error) goto out_finish;
- error = -ENOSPC; - if (WARN_ON_ONCE(bma.blkno == NULLFSBLOCK)) - goto out_finish; - if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) { - xfs_bmap_mark_sick(ip, whichfork); - error = -EFSCORRUPTED; + error = -EFSCORRUPTED; + if (WARN_ON_ONCE(!xfs_valid_startblock(ip, bma.got.br_startblock))) + goto out_finish;
## fs/xfs/libxfs/xfs_da_btree.c ## @@ fs/xfs/libxfs/xfs_da_btree.c: xfs_da_grow_inode_int( @@ fs/xfs/libxfs/xfs_da_btree.c: xfs_da_grow_inode_int( /* * Count the blocks we got, make sure it matches the total.
- ## fs/xfs/scrub/quota_repair.c ## -@@ fs/xfs/scrub/quota_repair.c: xrep_quota_item_fill_bmap_hole( - irec, &nmaps); - if (error) - return error; -- if (nmaps != 1) -- return -ENOSPC; - - dq->q_blkno = XFS_FSB_TO_DADDR(mp, irec->br_startblock); - -@@ fs/xfs/scrub/quota_repair.c: xrep_quota_data_fork( - XFS_BMAPI_CONVERT, 0, &nrec, &nmap); - if (error) - goto out; -- if (nmap != 1) { -- error = -ENOSPC; -- goto out; -- } - ASSERT(nrec.br_startoff == irec.br_startoff); - ASSERT(nrec.br_blockcount == irec.br_blockcount); - - - ## fs/xfs/scrub/rtbitmap_repair.c ## -@@ fs/xfs/scrub/rtbitmap_repair.c: xrep_rtbitmap_data_mappings( - 0, &map, &nmaps); - if (error) - return error; -- if (nmaps != 1) -- return -EFSCORRUPTED; - - /* Commit new extent and all deferred work. */ - error = xrep_defer_finish(sc); - ## fs/xfs/xfs_bmap_util.c ## @@ fs/xfs/xfs_bmap_util.c: xfs_alloc_file_space( if (error) @@ fs/xfs/xfs_iomap.c: xfs_iomap_write_direct( - goto out_unlock; - } - - if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) { - xfs_bmap_mark_sick(ip, XFS_DATA_FORK); + if (unlikely(!xfs_valid_startblock(ip, imap->br_startblock))) error = xfs_alert_fsblock_zero(ip, imap); +
## fs/xfs/xfs_reflink.c ## @@ fs/xfs/xfs_reflink.c: xfs_reflink_fill_cow_hole( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.1.y | Success | Success |
From: Christoph Hellwig hch@lst.de
[ Upstream commit d69bee6a35d3c5e4873b9e164dd1a9711351a97c ]
xfs_bmap_add_extent_delay_real takes parts or all of a delalloc extent and converts them to a real extent. It is written to deal with any potential overlap of the to be converted range with the delalloc extent, but it turns out that currently only converting the entire extents, or a part starting at the beginning is actually exercised, as the only caller always tries to convert the entire delalloc extent, and either succeeds or at least progresses partially from the start.
If it only converts a tiny part of a delalloc extent, the indirect block calculation for the new delalloc extent (da_new) might be equivalent to that of the existing delalloc extent (da_old). If this extent conversion now requires allocating an indirect block that gets accounted into da_new, leading to the assert that da_new must be smaller or equal to da_new unless we split the extent to trigger.
Except for the assert that case is actually handled by just trying to allocate more space, as that already handled for the split case (which currently can't be reached at all), so just reusing it should be fine. Except that without dipping into the reserved block pool that would make it a bit too easy to trigger a fs shutdown due to ENOSPC. So in addition to adjusting the assert, also dip into the reserved block pool.
Note that I could only reproduce the assert with a change to only convert the actually asked range instead of the full delalloc extent from xfs_bmapi_write.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_bmap.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 0235c1dd3d7e..f8a355e1196f 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -1528,10 +1528,11 @@ xfs_bmap_add_extent_delay_real( } error = xfs_bmbt_update(bma->cur, &LEFT); if (error) goto done; } + ASSERT(da_new <= da_old); break;
case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING | BMAP_LEFT_CONTIG: /* * Filling in all of a previously delayed allocation extent. @@ -1557,10 +1558,11 @@ xfs_bmap_add_extent_delay_real( } error = xfs_bmbt_update(bma->cur, &LEFT); if (error) goto done; } + ASSERT(da_new <= da_old); break;
case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING | BMAP_RIGHT_CONTIG: /* * Filling in all of a previously delayed allocation extent. @@ -1590,10 +1592,11 @@ xfs_bmap_add_extent_delay_real( } error = xfs_bmbt_update(bma->cur, &PREV); if (error) goto done; } + ASSERT(da_new <= da_old); break;
case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING: /* * Filling in all of a previously delayed allocation extent. @@ -1622,10 +1625,11 @@ xfs_bmap_add_extent_delay_real( if (XFS_IS_CORRUPT(mp, i != 1)) { error = -EFSCORRUPTED; goto done; } } + ASSERT(da_new <= da_old); break;
case BMAP_LEFT_FILLING | BMAP_LEFT_CONTIG: /* * Filling in the first part of a previous delayed allocation. @@ -1659,10 +1663,11 @@ xfs_bmap_add_extent_delay_real( } error = xfs_bmbt_update(bma->cur, &LEFT); if (error) goto done; } + ASSERT(da_new <= da_old); break;
case BMAP_LEFT_FILLING: /* * Filling in the first part of a previous delayed allocation. @@ -1746,10 +1751,11 @@ xfs_bmap_add_extent_delay_real( PREV.br_startblock = nullstartblock(da_new);
xfs_iext_update_extent(bma->ip, state, &bma->icur, &PREV); xfs_iext_next(ifp, &bma->icur); xfs_iext_update_extent(bma->ip, state, &bma->icur, &RIGHT); + ASSERT(da_new <= da_old); break;
case BMAP_RIGHT_FILLING: /* * Filling in the last part of a previous delayed allocation. @@ -1793,10 +1799,11 @@ xfs_bmap_add_extent_delay_real(
PREV.br_startblock = nullstartblock(da_new); PREV.br_blockcount = temp; xfs_iext_insert(bma->ip, &bma->icur, &PREV, state); xfs_iext_next(ifp, &bma->icur); + ASSERT(da_new <= da_old); break;
case 0: /* * Filling in the middle part of a previous delayed allocation. @@ -1913,15 +1920,13 @@ xfs_bmap_add_extent_delay_real( da_new += bma->cur->bc_ino.allocated; bma->cur->bc_ino.allocated = 0; }
/* adjust for changes in reserved delayed indirect blocks */ - if (da_new != da_old) { - ASSERT(state == 0 || da_new < da_old); + if (da_new != da_old) error = xfs_mod_fdblocks(mp, (int64_t)(da_old - da_new), - false); - } + true);
xfs_bmap_check_leaf_extents(bma->cur, bma->ip, whichfork); done: if (whichfork != XFS_COW_FORK) bma->logflags |= rval;
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: d69bee6a35d3c5e4873b9e164dd1a9711351a97c
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Christoph Hellwighch@lst.de
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 4bcef72d96b5)
Note: The patch differs from the upstream commit: --- 1: d69bee6a35d3c ! 1: 2cb51f69d0955 xfs: fix xfs_bmap_add_extent_delay_real for partial conversions @@ Metadata ## Commit message ## xfs: fix xfs_bmap_add_extent_delay_real for partial conversions
+ [ Upstream commit d69bee6a35d3c5e4873b9e164dd1a9711351a97c ] + xfs_bmap_add_extent_delay_real takes parts or all of a delalloc extent and converts them to a real extent. It is written to deal with any potential overlap of the to be converted range with the delalloc extent, @@ Commit message Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_bmap.c ## @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmap_add_extent_delay_real( @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmap_add_extent_delay_real( }
/* adjust for changes in reserved delayed indirect blocks */ -- if (da_new < da_old) { -+ if (da_new < da_old) - xfs_add_fdblocks(mp, da_old - da_new); -- } else if (da_new > da_old) { -- ASSERT(state == 0); -- error = xfs_dec_fdblocks(mp, da_new - da_old, false); +- if (da_new != da_old) { +- ASSERT(state == 0 || da_new < da_old); ++ if (da_new != da_old) + error = xfs_mod_fdblocks(mp, (int64_t)(da_old - da_new), +- false); - } -+ else if (da_new > da_old) -+ error = xfs_dec_fdblocks(mp, da_new - da_old, true); ++ true);
xfs_bmap_check_leaf_extents(bma->cur, bma->ip, whichfork); done: ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Christoph Hellwig hch@lst.de
[ Upstream commit 86de848403abda05bf9c16dcdb6bef65a8d88c41 ]
Accessing if_bytes without the ilock is racy. Remove the initial if_bytes == 0 check in xfs_reflink_end_cow_extent and let ext_iext_lookup_extent fail for this case after we've taken the ilock.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_reflink.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 0405226fb74c..d539487eaf1a 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -716,16 +716,10 @@ xfs_reflink_end_cow_extent( struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_COW_FORK); unsigned int resblks; int nmaps; int error;
- /* No COW extents? That's easy! */ - if (ifp->if_bytes == 0) { - *offset_fsb = end_fsb; - return 0; - } - resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, XFS_TRANS_RESERVE, &tp); if (error) return error;
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 86de848403abda05bf9c16dcdb6bef65a8d88c41
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Christoph Hellwighch@lst.de
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: c299188b443a)
Note: The patch differs from the upstream commit: --- 1: 86de848403abd ! 1: 17fd2403176bf xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent @@ Metadata ## Commit message ## xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent
+ [ Upstream commit 86de848403abda05bf9c16dcdb6bef65a8d88c41 ] + Accessing if_bytes without the ilock is racy. Remove the initial if_bytes == 0 check in xfs_reflink_end_cow_extent and let ext_iext_lookup_extent fail for this case after we've taken the ilock. @@ Commit message Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_reflink.c ## @@ fs/xfs/xfs_reflink.c: xfs_reflink_end_cow_extent( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 8ef1d96a985e4dc07ffbd71bd7fc5604a80cc644 ]
The XFS_SB_FEAT_INCOMPAT_LOG_XATTRS feature bit protects a filesystem from old kernels that do not know how to recover extended attribute log intent items. Make this check mandatory instead of a debugging assert.
Fixes: fd920008784ea ("xfs: Set up infrastructure for log attribute replay") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_attr_item.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c index 11e88a76a33c..c5bc6b72e014 100644 --- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -508,10 +508,13 @@ xfs_attri_validate( struct xfs_attri_log_format *attrp) { unsigned int op = attrp->alfi_op_flags & XFS_ATTRI_OP_FLAGS_TYPE_MASK;
+ if (!xfs_sb_version_haslogxattrs(&mp->m_sb)) + return false; + if (attrp->__pad != 0) return false;
if (attrp->alfi_op_flags & ~XFS_ATTRI_OP_FLAGS_TYPE_MASK) return false; @@ -599,12 +602,10 @@ xfs_attri_item_recover( args->hashval = xfs_da_hashname(args->name, args->namelen); args->attr_filter = attrp->alfi_attr_filter & XFS_ATTRI_FILTER_MASK; args->op_flags = XFS_DA_OP_RECOVERY | XFS_DA_OP_OKNOENT | XFS_DA_OP_LOGGED;
- ASSERT(xfs_sb_version_haslogxattrs(&mp->m_sb)); - switch (attr->xattri_op_flags) { case XFS_ATTRI_OP_FLAGS_SET: case XFS_ATTRI_OP_FLAGS_REPLACE: args->value = nv->value.i_addr; args->valuelen = nv->value.i_len;
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 8ef1d96a985e4dc07ffbd71bd7fc5604a80cc644
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: c13c21f77824)
Note: The patch differs from the upstream commit: --- 1: 8ef1d96a985e4 ! 1: e96849ae4ddac xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery @@ Metadata ## Commit message ## xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery
+ [ Upstream commit 8ef1d96a985e4dc07ffbd71bd7fc5604a80cc644 ] + The XFS_SB_FEAT_INCOMPAT_LOG_XATTRS feature bit protects a filesystem from old kernels that do not know how to recover extended attribute log intent items. Make this check mandatory instead of a debugging assert. @@ Commit message Fixes: fd920008784ea ("xfs: Set up infrastructure for log attribute replay") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_attr_item.c ## @@ fs/xfs/xfs_attr_item.c: xfs_attri_validate( @@ fs/xfs/xfs_attr_item.c: xfs_attri_validate( if (attrp->__pad != 0) return false;
-@@ fs/xfs/xfs_attr_item.c: xfs_attri_recover_work( +@@ fs/xfs/xfs_attr_item.c: xfs_attri_item_recover( + args->op_flags = XFS_DA_OP_RECOVERY | XFS_DA_OP_OKNOENT | XFS_DA_OP_LOGGED; - args->owner = args->dp->i_ino;
- ASSERT(xfs_sb_version_haslogxattrs(&mp->m_sb)); - ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit ad206ae50eca62836c5460ab5bbf2a6c59a268e7 ]
Check that the number of recovered log iovecs is what is expected for the xattri opcode is expecting.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_attr_item.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c index c5bc6b72e014..a8e09ea2622d 100644 --- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -715,28 +715,55 @@ xlog_recover_attri_commit_pass2( struct xfs_attri_log_format *attri_formatp; struct xfs_attri_log_nameval *nv; const void *attr_value = NULL; const void *attr_name; size_t len; + unsigned int op;
attri_formatp = item->ri_buf[0].i_addr; attr_name = item->ri_buf[1].i_addr;
/* Validate xfs_attri_log_format before the large memory allocation */ len = sizeof(struct xfs_attri_log_format); if (item->ri_buf[0].i_len != len) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, item->ri_buf[0].i_addr, item->ri_buf[0].i_len); return -EFSCORRUPTED; }
if (!xfs_attri_validate(mp, attri_formatp)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, item->ri_buf[0].i_addr, item->ri_buf[0].i_len); return -EFSCORRUPTED; }
+ /* Check the number of log iovecs makes sense for the op code. */ + op = attri_formatp->alfi_op_flags & XFS_ATTRI_OP_FLAGS_TYPE_MASK; + switch (op) { + case XFS_ATTRI_OP_FLAGS_SET: + case XFS_ATTRI_OP_FLAGS_REPLACE: + /* Log item, attr name, attr value */ + if (item->ri_total != 3) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + break; + case XFS_ATTRI_OP_FLAGS_REMOVE: + /* Log item, attr name */ + if (item->ri_total != 2) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + break; + default: + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + /* Validate the attr name */ if (item->ri_buf[1].i_len != xlog_calc_iovec_len(attri_formatp->alfi_name_len)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, item->ri_buf[0].i_addr, item->ri_buf[0].i_len);
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: ad206ae50eca62836c5460ab5bbf2a6c59a268e7
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 0934046e3392)
Note: The patch differs from the upstream commit: --- 1: ad206ae50eca6 ! 1: 6bab668a662e3 xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 @@ Metadata ## Commit message ## xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2
+ [ Upstream commit ad206ae50eca62836c5460ab5bbf2a6c59a268e7 ] + Check that the number of recovered log iovecs is what is expected for the xattri opcode is expecting.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_attr_item.c ## @@ fs/xfs/xfs_attr_item.c: xlog_recover_attri_commit_pass2( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 1c7f09d210aba2f2bb206e2e8c97c9f11a3fd880 ]
Strengthen the xattri log item recovery code by checking that we actually have the required name and newname buffers for whatever operation we're replaying.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_attr_item.c | 58 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 47 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c index a8e09ea2622d..4a712f1565c1 100644 --- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -715,26 +715,24 @@ xlog_recover_attri_commit_pass2( struct xfs_attri_log_format *attri_formatp; struct xfs_attri_log_nameval *nv; const void *attr_value = NULL; const void *attr_name; size_t len; - unsigned int op; - - attri_formatp = item->ri_buf[0].i_addr; - attr_name = item->ri_buf[1].i_addr; + unsigned int op, i = 0;
/* Validate xfs_attri_log_format before the large memory allocation */ len = sizeof(struct xfs_attri_log_format); - if (item->ri_buf[0].i_len != len) { + if (item->ri_buf[i].i_len != len) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, item->ri_buf[0].i_addr, item->ri_buf[0].i_len); return -EFSCORRUPTED; }
+ attri_formatp = item->ri_buf[i].i_addr; if (!xfs_attri_validate(mp, attri_formatp)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, - item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + attri_formatp, len); return -EFSCORRUPTED; }
/* Check the number of log iovecs makes sense for the op code. */ op = attri_formatp->alfi_op_flags & XFS_ATTRI_OP_FLAGS_TYPE_MASK; @@ -759,35 +757,73 @@ xlog_recover_attri_commit_pass2( default: XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, attri_formatp, len); return -EFSCORRUPTED; } + i++;
/* Validate the attr name */ - if (item->ri_buf[1].i_len != + if (item->ri_buf[i].i_len != xlog_calc_iovec_len(attri_formatp->alfi_name_len)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, - item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + attri_formatp, len); return -EFSCORRUPTED; }
+ attr_name = item->ri_buf[i].i_addr; if (!xfs_attr_namecheck(attr_name, attri_formatp->alfi_name_len)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, - item->ri_buf[1].i_addr, item->ri_buf[1].i_len); + attri_formatp, len); return -EFSCORRUPTED; } + i++;
/* Validate the attr value, if present */ if (attri_formatp->alfi_value_len != 0) { - if (item->ri_buf[2].i_len != xlog_calc_iovec_len(attri_formatp->alfi_value_len)) { + if (item->ri_buf[i].i_len != xlog_calc_iovec_len(attri_formatp->alfi_value_len)) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, item->ri_buf[0].i_addr, item->ri_buf[0].i_len); return -EFSCORRUPTED; }
- attr_value = item->ri_buf[2].i_addr; + attr_value = item->ri_buf[i].i_addr; + i++; + } + + /* + * Make sure we got the correct number of buffers for the operation + * that we just loaded. + */ + if (i != item->ri_total) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + + switch (op) { + case XFS_ATTRI_OP_FLAGS_REMOVE: + /* Regular remove operations operate only on names. */ + if (attr_value != NULL || attri_formatp->alfi_value_len != 0) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + fallthrough; + case XFS_ATTRI_OP_FLAGS_SET: + case XFS_ATTRI_OP_FLAGS_REPLACE: + /* + * Regular xattr set/remove/replace operations require a name + * and do not take a newname. Values are optional for set and + * replace. + */ + if (attr_name == NULL || attri_formatp->alfi_name_len == 0) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + attri_formatp, len); + return -EFSCORRUPTED; + } + break; }
/* * Memory alloc failure will cause replay to abort. We attach the * name/value buffer to the recovered incore log item and drop our
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 1c7f09d210aba2f2bb206e2e8c97c9f11a3fd880
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 9716cdcc2f9e)
Note: The patch differs from the upstream commit: --- 1: 1c7f09d210aba ! 1: 2ed17884c4e4f xfs: validate recovered name buffers when recovering xattr items @@ Metadata ## Commit message ## xfs: validate recovered name buffers when recovering xattr items
+ [ Upstream commit 1c7f09d210aba2f2bb206e2e8c97c9f11a3fd880 ] + Strengthen the xattri log item recovery code by checking that we actually have the required name and newname buffers for whatever operation we're replaying.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_attr_item.c ## @@ fs/xfs/xfs_attr_item.c: xlog_recover_attri_commit_pass2( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 2a009397eb5ae178670cbd7101e9635cf6412b35 ]
In my haste to fix what I thought was a performance problem in the attr scrub code, I neglected to notice that the xfs_attr_get_ilocked also had the effect of checking that attributes can actually be looked up through the attr dabtree. Fix this.
Fixes: 44af6c7e59b12 ("xfs: don't load local xattr values during scrub") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/scrub/attr.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c index b6f0c9f3f124..f51771e5c3fe 100644 --- a/fs/xfs/scrub/attr.c +++ b/fs/xfs/scrub/attr.c @@ -157,10 +157,15 @@ xchk_xattr_listent( args.hashval = xfs_da_hashname(args.name, args.namelen); args.trans = context->tp; args.value = xchk_xattr_valuebuf(sx->sc); args.valuelen = valuelen;
+ /* + * Get the attr value to ensure that lookup can find this attribute + * through the dabtree indexing and that remote value retrieval also + * works correctly. + */ error = xfs_attr_get_ilocked(&args); /* ENODATA means the hash lookup failed and the attr is bad */ if (error == -ENODATA) error = -EFSCORRUPTED; if (!xchk_fblock_process_error(sx->sc, XFS_ATTR_FORK, args.blkno,
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 2a009397eb5ae178670cbd7101e9635cf6412b35
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 20adb1e2f069)
Note: The patch differs from the upstream commit: --- 1: 2a009397eb5ae < -: ------------- xfs: revert commit 44af6c7e59b12 -: ------------- > 1: 4bf3fa8a1cd5e xfs: revert commit 44af6c7e59b12 ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit bb712842a85d595525e72f0e378c143e620b3ea2 ]
Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replace xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the writing inode, and a new variable lockmode is used to indicate the lock mode. Although the lockmode should always be XFS_ILOCK_EXCL, it's still better to use this variable instead of useing XFS_ILOCK_EXCL directly when unlocking the inode.
Fixes: 1aa91d9c9933 ("xfs: Add async buffered write support") Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_iomap.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 60d9638cec6d..f6ca27a42498 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1127,37 +1127,37 @@ xfs_buffered_write_iomap_begin( /* * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch * them out if the write happens to fail. */ seq = xfs_iomap_inode_sequence(ip, IOMAP_F_NEW); - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); trace_xfs_iomap_alloc(ip, offset, count, allocfork, &imap); return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, IOMAP_F_NEW, seq);
found_imap: seq = xfs_iomap_inode_sequence(ip, 0); - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
found_cow: seq = xfs_iomap_inode_sequence(ip, 0); if (imap.br_startoff <= offset_fsb) { error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags, 0, seq); if (error) goto out_unlock; seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED); - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq); }
xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb); - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, 0, seq);
out_unlock: - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); return error; }
static int xfs_buffered_write_delalloc_punch(
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: bb712842a85d595525e72f0e378c143e620b3ea2
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Zhang Yiyi.zhang@huawei.com
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: f24ba2183148)
Note: The patch differs from the upstream commit: --- 1: bb712842a85d5 ! 1: 359664e60e303 xfs: match lock mode in xfs_buffered_write_iomap_begin() @@ Metadata ## Commit message ## xfs: match lock mode in xfs_buffered_write_iomap_begin()
+ [ Upstream commit bb712842a85d595525e72f0e378c143e620b3ea2 ] + Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replace xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the writing inode, and a new variable lockmode is used to indicate the lock @@ Commit message Reviewed-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_iomap.c ## @@ fs/xfs/xfs_iomap.c: xfs_buffered_write_iomap_begin( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit fc8d0ba0ff5fe4700fa02008b7751ec6b84b7677 ]
Allow callers to pass a NULLL seq argument if they don't care about the fork sequence number.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_bmap.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index f8a355e1196f..92d321dd8944 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4578,11 +4578,12 @@ xfs_bmapi_convert_delalloc( * the extent. Just return the real extent at this offset. */ if (!isnullstartblock(bma.got.br_startblock)) { xfs_bmbt_to_iomap(ip, iomap, &bma.got, 0, flags, xfs_iomap_inode_sequence(ip, flags)); - *seq = READ_ONCE(ifp->if_seq); + if (seq) + *seq = READ_ONCE(ifp->if_seq); goto out_trans_cancel; }
bma.tp = tp; bma.ip = ip; @@ -4624,11 +4625,12 @@ xfs_bmapi_convert_delalloc( XFS_STATS_INC(mp, xs_xstrat_quick);
ASSERT(!isnullstartblock(bma.got.br_startblock)); xfs_bmbt_to_iomap(ip, iomap, &bma.got, 0, flags, xfs_iomap_inode_sequence(ip, flags)); - *seq = READ_ONCE(ifp->if_seq); + if (seq) + *seq = READ_ONCE(ifp->if_seq);
if (whichfork == XFS_COW_FORK) xfs_refcount_alloc_cow_extent(tp, bma.blkno, bma.length);
error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags,
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: fc8d0ba0ff5fe4700fa02008b7751ec6b84b7677
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Zhang Yiyi.zhang@huawei.com
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 0f726c17dfd8)
Note: The patch differs from the upstream commit: --- 1: fc8d0ba0ff5fe ! 1: 66fa71c76709d xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional @@ Metadata ## Commit message ## xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
+ [ Upstream commit fc8d0ba0ff5fe4700fa02008b7751ec6b84b7677 ] + Allow callers to pass a NULLL seq argument if they don't care about the fork sequence number.
@@ Commit message Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_bmap.c ## @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_convert_delalloc( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 2e08371a83f1c06fd85eea8cd37c87a224cc4cc4 ]
Since xfs_bmapi_convert_delalloc() only attempts to allocate the entire delalloc extent and require multiple invocations to allocate the target offset. So xfs_convert_blocks() add a loop to do this job and we call it in the write back path, but xfs_convert_blocks() isn't a common helper. Let's do it in xfs_bmapi_convert_delalloc() and drop xfs_convert_blocks(), preparing for the post EOF delalloc blocks converting in the buffered write begin path.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_bmap.c | 34 +++++++++++++++++++++++-- fs/xfs/xfs_aops.c | 54 +++++++++++----------------------------- 2 files changed, 46 insertions(+), 42 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 92d321dd8944..a3c4d4a442af 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4520,12 +4520,12 @@ xfs_bmapi_write( * Convert an existing delalloc extent to real blocks based on file offset. This * attempts to allocate the entire delalloc extent and may require multiple * invocations to allocate the target offset if a large enough physical extent * is not available. */ -int -xfs_bmapi_convert_delalloc( +static int +xfs_bmapi_convert_one_delalloc( struct xfs_inode *ip, int whichfork, xfs_off_t offset, struct iomap *iomap, unsigned int *seq) @@ -4649,10 +4649,40 @@ xfs_bmapi_convert_delalloc( xfs_trans_cancel(tp); xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; }
+/* + * Pass in a dellalloc extent and convert it to real extents, return the real + * extent that maps offset_fsb in iomap. + */ +int +xfs_bmapi_convert_delalloc( + struct xfs_inode *ip, + int whichfork, + loff_t offset, + struct iomap *iomap, + unsigned int *seq) +{ + int error; + + /* + * Attempt to allocate whatever delalloc extent currently backs offset + * and put the result into iomap. Allocate in a loop because it may + * take several attempts to allocate real blocks for a contiguous + * delalloc extent if free space is sufficiently fragmented. + */ + do { + error = xfs_bmapi_convert_one_delalloc(ip, whichfork, offset, + iomap, seq); + if (error) + return error; + } while (iomap->offset + iomap->length <= offset); + + return 0; +} + int xfs_bmapi_remap( struct xfs_trans *tp, struct xfs_inode *ip, xfs_fileoff_t bno, diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 21c241e96d48..50a7f2745514 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -223,66 +223,28 @@ xfs_imap_valid( XFS_WPC(wpc)->cow_seq != READ_ONCE(ip->i_cowfp->if_seq)) return false; return true; }
-/* - * Pass in a dellalloc extent and convert it to real extents, return the real - * extent that maps offset_fsb in wpc->iomap. - * - * The current page is held locked so nothing could have removed the block - * backing offset_fsb, although it could have moved from the COW to the data - * fork by another thread. - */ -static int -xfs_convert_blocks( - struct iomap_writepage_ctx *wpc, - struct xfs_inode *ip, - int whichfork, - loff_t offset) -{ - int error; - unsigned *seq; - - if (whichfork == XFS_COW_FORK) - seq = &XFS_WPC(wpc)->cow_seq; - else - seq = &XFS_WPC(wpc)->data_seq; - - /* - * Attempt to allocate whatever delalloc extent currently backs offset - * and put the result into wpc->iomap. Allocate in a loop because it - * may take several attempts to allocate real blocks for a contiguous - * delalloc extent if free space is sufficiently fragmented. - */ - do { - error = xfs_bmapi_convert_delalloc(ip, whichfork, offset, - &wpc->iomap, seq); - if (error) - return error; - } while (wpc->iomap.offset + wpc->iomap.length <= offset); - - return 0; -} - static int xfs_map_blocks( struct iomap_writepage_ctx *wpc, struct inode *inode, loff_t offset) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; ssize_t count = i_blocksize(inode); xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); xfs_fileoff_t cow_fsb; int whichfork; struct xfs_bmbt_irec imap; struct xfs_iext_cursor icur; int retries = 0; int error = 0; + unsigned int *seq;
if (xfs_is_shutdown(mp)) return -EIO;
/* @@ -374,11 +336,23 @@ xfs_map_blocks(
xfs_bmbt_to_iomap(ip, &wpc->iomap, &imap, 0, 0, XFS_WPC(wpc)->data_seq); trace_xfs_map_blocks_found(ip, offset, count, whichfork, &imap); return 0; allocate_blocks: - error = xfs_convert_blocks(wpc, ip, whichfork, offset); + /* + * Convert a dellalloc extent to a real one. The current page is held + * locked so nothing could have removed the block backing offset_fsb, + * although it could have moved from the COW to the data fork by another + * thread. + */ + if (whichfork == XFS_COW_FORK) + seq = &XFS_WPC(wpc)->cow_seq; + else + seq = &XFS_WPC(wpc)->data_seq; + + error = xfs_bmapi_convert_delalloc(ip, whichfork, offset, + &wpc->iomap, seq); if (error) { /* * If we failed to find the extent in the COW fork we might have * raced with a COW to data fork conversion or truncate. * Restart the lookup to catch the extent in the data fork for
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 2e08371a83f1c06fd85eea8cd37c87a224cc4cc4
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Zhang Yiyi.zhang@huawei.com
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 36081fd0ee37)
Note: The patch differs from the upstream commit: --- 1: 2e08371a83f1c ! 1: 1ba14c6680bca xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset @@ Metadata ## Commit message ## xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
+ [ Upstream commit 2e08371a83f1c06fd85eea8cd37c87a224cc4cc4 ] + Since xfs_bmapi_convert_delalloc() only attempts to allocate the entire delalloc extent and require multiple invocations to allocate the target offset. So xfs_convert_blocks() add a loop to do this job and we call it @@ Commit message Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_bmap.c ## @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_write( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 5ce5674187c345dc31534d2024c09ad8ef29b7ba ]
Current clone operation could be non-atomic if the destination of a file is beyond EOF, user could get a file with corrupted (zeroed) data on crash.
The problem is about preallocations. If you write some data into a file:
[A...B)
and XFS decides to preallocate some post-eof blocks, then it can create a delayed allocation reservation:
[A.........D)
The writeback path tries to convert delayed extents to real ones by allocating blocks. If there aren't enough contiguous free space, we can end up with two extents, the first real and the second still delalloc:
[A....C)[C.D)
After that, both the in-memory and the on-disk file sizes are still B. If we clone into the range [E...F) from another file:
[A....C)[C.D) [E...F)
then xfs_reflink_zero_posteof() calls iomap_zero_range() to zero out the range [B, E) beyond EOF and flush it. Since [C, D) is still a delalloc extent, its pagecache will be zeroed and both the in-memory and on-disk size will be updated to D after flushing but before cloning. This is wrong, because the user can see the size change and read the zeroes while the clone operation is ongoing.
We need to keep the in-memory and on-disk size before the clone operation starts, so instead of writing zeroes through the page cache for delayed ranges beyond EOF, we convert these ranges to unwritten and invalidate any cached data over that range beyond EOF.
Suggested-by: Dave Chinner david@fromorbit.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_iomap.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index f6ca27a42498..fab191a09442 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -994,10 +994,28 @@ xfs_buffered_write_iomap_begin( if ((flags & IOMAP_ZERO) && imap.br_startoff > offset_fsb) { xfs_hole_to_iomap(ip, iomap, offset_fsb, imap.br_startoff); goto out_unlock; }
+ /* + * For zeroing, trim a delalloc extent that extends beyond the EOF + * block. If it starts beyond the EOF block, convert it to an + * unwritten extent. + */ + if ((flags & IOMAP_ZERO) && imap.br_startoff <= offset_fsb && + isnullstartblock(imap.br_startblock)) { + xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip)); + + if (offset_fsb >= eof_fsb) + goto convert_delay; + if (end_fsb > eof_fsb) { + end_fsb = eof_fsb; + xfs_trim_extent(&imap, offset_fsb, + end_fsb - offset_fsb); + } + } + /* * Search the COW fork extent list even if we did not find a data fork * extent. This serves two purposes: first this implements the * speculative preallocation using cowextsize, so that we also unshare * block adjacent to shared blocks instead of just the shared blocks @@ -1136,10 +1154,21 @@ xfs_buffered_write_iomap_begin( found_imap: seq = xfs_iomap_inode_sequence(ip, 0); xfs_iunlock(ip, lockmode); return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
+convert_delay: + xfs_iunlock(ip, lockmode); + truncate_pagecache(inode, offset); + error = xfs_bmapi_convert_delalloc(ip, XFS_DATA_FORK, offset, + iomap, NULL); + if (error) + return error; + + trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &imap); + return 0; + found_cow: seq = xfs_iomap_inode_sequence(ip, 0); if (imap.br_startoff <= offset_fsb) { error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags, 0, seq); if (error)
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 5ce5674187c345dc31534d2024c09ad8ef29b7ba
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Zhang Yiyi.zhang@huawei.com
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 4c99f3026cf2)
Note: The patch differs from the upstream commit: --- 1: 5ce5674187c34 ! 1: 05bb9e0eda29d xfs: convert delayed extents to unwritten when zeroing post eof blocks @@ Metadata ## Commit message ## xfs: convert delayed extents to unwritten when zeroing post eof blocks
+ [ Upstream commit 5ce5674187c345dc31534d2024c09ad8ef29b7ba ] + Current clone operation could be non-atomic if the destination of a file is beyond EOF, user could get a file with corrupted (zeroed) data on crash. @@ Commit message Reviewed-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_iomap.c ## @@ fs/xfs/xfs_iomap.c: xfs_buffered_write_iomap_begin( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 38de567906d95c397d87f292b892686b7ec6fbc3 ]
An internal user complained about log recovery failing on a symlink ("Bad dinode after recovery") with the following (excerpted) format:
core.magic = 0x494e core.mode = 0120777 core.version = 3 core.format = 2 (extents) core.nlinkv2 = 1 core.nextents = 1 core.size = 297 core.nblocks = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) u3.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,12,1,0]
This is a symbolic link with a 297-byte target stored in a disk block, which is to say this is a symlink with a remote target. The forkoff is 0, which is to say that there's 512 - 176 == 336 bytes in the inode core to store the data fork.
Eventually, testing of generic/388 failed with the same inode corruption message during inode recovery. In writing a debugging patch to call xfs_dinode_verify on dirty inode log items when we're committing transactions, I observed that xfs/298 can reproduce the problem quite quickly.
xfs/298 creates a symbolic link, adds some extended attributes, then deletes them all. The test failure occurs when the final removexattr also deletes the attr fork because that does not convert the remote symlink back into a shortform symlink. That is how we trip this test. The only reason why xfs/298 only triggers with the debug patch added is that it deletes the symlink, so the final iflush shows the inode as free.
I wrote a quick fstest to emulate the behavior of xfs/298, except that it leaves the symlinks on the filesystem after inducing the "corrupt" state. Kernels going back at least as far as 4.18 have written out symlink inodes in this manner and prior to 1eb70f54c445f they did not object to reading them back in.
Because we've been writing out inodes this way for quite some time, the only way to fix this is to relax the check for symbolic links. Directories don't have this problem because di_size is bumped to blocksize during the sf->data conversion.
Fixes: 1eb70f54c445f ("xfs: validate inode fork size against fork format") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_inode_buf.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 601b05ca5fc2..127c9a698d9f 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -363,21 +363,41 @@ xfs_dinode_verify_fork( di_nextents = xfs_dfork_nextents(dip, whichfork);
/* * For fork types that can contain local data, check that the fork * format matches the size of local data contained within the fork. - * - * For all types, check that when the size says the should be in extent - * or btree format, the inode isn't claiming it is in local format. */ if (whichfork == XFS_DATA_FORK) { - if (S_ISDIR(mode) || S_ISLNK(mode)) { + /* + * A directory small enough to fit in the inode must be stored + * in local format. The directory sf <-> extents conversion + * code updates the directory size accordingly. + */ + if (S_ISDIR(mode)) { + if (be64_to_cpu(dip->di_size) <= fork_size && + fork_format != XFS_DINODE_FMT_LOCAL) + return __this_address; + } + + /* + * A symlink with a target small enough to fit in the inode can + * be stored in extents format if xattrs were added (thus + * converting the data fork from shortform to remote format) + * and then removed. + */ + if (S_ISLNK(mode)) { if (be64_to_cpu(dip->di_size) <= fork_size && + fork_format != XFS_DINODE_FMT_EXTENTS && fork_format != XFS_DINODE_FMT_LOCAL) return __this_address; }
+ /* + * For all types, check that when the size says the fork should + * be in extent or btree format, the inode isn't claiming to be + * in local format. + */ if (be64_to_cpu(dip->di_size) > fork_size && fork_format == XFS_DINODE_FMT_LOCAL) return __this_address; }
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 38de567906d95c397d87f292b892686b7ec6fbc3
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 0aca73915dc1)
Note: The patch differs from the upstream commit: --- 1: 38de567906d95 ! 1: a037bc2828645 xfs: allow symlinks with short remote targets @@ Metadata ## Commit message ## xfs: allow symlinks with short remote targets
+ [ Upstream commit 38de567906d95c397d87f292b892686b7ec6fbc3 ] + An internal user complained about log recovery failing on a symlink ("Bad dinode after recovery") with the following (excerpted) format:
@@ Commit message Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_inode_buf.c ## @@ fs/xfs/libxfs/xfs_inode_buf.c: xfs_dinode_verify_fork( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Wengang Wang wen.gang.wang@oracle.com
[ Upstream commit 58f880711f2ba53fd5e959875aff5b3bf6d5c32e ]
A user with a completely full filesystem experienced an unexpected shutdown when the filesystem tried to write the superblock during runtime. kernel shows the following dmesg:
[ 8.176281] XFS (dm-4): Metadata corruption detected at xfs_sb_write_verify+0x60/0x120 [xfs], xfs_sb block 0x0 [ 8.177417] XFS (dm-4): Unmount and run xfs_repair [ 8.178016] XFS (dm-4): First 128 bytes of corrupted metadata buffer: [ 8.178703] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 01 90 00 00 XFSB............ [ 8.179487] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 8.180312] 00000020: cf 12 dc 89 ca 26 45 29 92 e6 e3 8d 3b b8 a2 c3 .....&E)....;... [ 8.181150] 00000030: 00 00 00 00 01 00 00 06 00 00 00 00 00 00 00 80 ................ [ 8.182003] 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................ [ 8.182004] 00000050: 00 00 00 01 00 64 00 00 00 00 00 04 00 00 00 00 .....d.......... [ 8.182004] 00000060: 00 00 64 00 b4 a5 02 00 02 00 00 08 00 00 00 00 ..d............. [ 8.182005] 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 17 00 00 19 ................ [ 8.182008] XFS (dm-4): Corruption of in-memory data detected. Shutting down filesystem [ 8.182010] XFS (dm-4): Please unmount the filesystem and rectify the problem(s)
When xfs_log_sb writes super block to disk, b_fdblocks is fetched from m_fdblocks without any lock. As m_fdblocks can experience a positive -> negative -> positive changing when the FS reaches fullness (see xfs_mod_fdblocks). So there is a chance that sb_fdblocks is negative, and because sb_fdblocks is type of unsigned long long, it reads super big. And sb_fdblocks being bigger than sb_dblocks is a problem during log recovery, xfs_validate_sb_write() complains.
Fix: As sb_fdblocks will be re-calculated during mount when lazysbcount is enabled, We just need to make xfs_validate_sb_write() happy -- make sure sb_fdblocks is not nenative. This patch also takes care of other percpu counters in xfs_log_sb.
Signed-off-by: Wengang Wang wen.gang.wang@oracle.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 90ed55cd3d10..8e0a176b8e0b 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -1020,15 +1020,16 @@ xfs_log_sb( * sb counters, despite having a percpu counter. It is always kept * consistent with the ondisk rtbitmap by xfs_trans_apply_sb_deltas() * and hence we don't need have to update it here. */ if (xfs_has_lazysbcount(mp)) { - mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount); + mp->m_sb.sb_icount = percpu_counter_sum_positive(&mp->m_icount); mp->m_sb.sb_ifree = min_t(uint64_t, - percpu_counter_sum(&mp->m_ifree), + percpu_counter_sum_positive(&mp->m_ifree), mp->m_sb.sb_icount); - mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks); + mp->m_sb.sb_fdblocks = + percpu_counter_sum_positive(&mp->m_fdblocks); }
xfs_sb_to_disk(bp->b_addr, &mp->m_sb); xfs_trans_buf_set_type(tp, bp, XFS_BLFT_SB_BUF); xfs_trans_log_buf(tp, bp, 0, sizeof(struct xfs_dsb) - 1);
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 58f880711f2ba53fd5e959875aff5b3bf6d5c32e
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Wengang Wangwen.gang.wang@oracle.com
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 0e52b98bf041)
Note: The patch differs from the upstream commit: --- 1: 58f880711f2ba ! 1: 621457ef92ee7 xfs: make sure sb_fdblocks is non-negative @@ Metadata ## Commit message ## xfs: make sure sb_fdblocks is non-negative
+ [ Upstream commit 58f880711f2ba53fd5e959875aff5b3bf6d5c32e ] + A user with a completely full filesystem experienced an unexpected shutdown when the filesystem tried to write the superblock during runtime. @@ Commit message Signed-off-by: Wengang Wang wen.gang.wang@oracle.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_sb.c ## @@ fs/xfs/libxfs/xfs_sb.c: xfs_log_sb( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: Christoph Hellwig hch@lst.de
[ Upstream commit 610b29161b0aa9feb59b78dc867553274f17fb01 ]
xfs_can_free_eofblocks returns false for files that have persistent preallocations unless the force flag is passed and there are delayed blocks. This means it won't free delalloc reservations for files with persistent preallocations unless the force flag is set, and it will also free the persistent preallocations if the force flag is set and the file happens to have delayed allocations.
Both of these are bad, so do away with the force flag and always free only post-EOF delayed allocations for files with the XFS_DIFLAG_PREALLOC or APPEND flags set.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/xfs_bmap_util.c | 30 ++++++++++++++++++++++-------- fs/xfs/xfs_bmap_util.h | 2 +- fs/xfs/xfs_icache.c | 2 +- fs/xfs/xfs_inode.c | 14 ++++---------- 4 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 399451aab26a..62b92e92a685 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -634,17 +634,15 @@ xfs_bmap_punch_delalloc_range( return error; }
/* * Test whether it is appropriate to check an inode for and free post EOF - * blocks. The 'force' parameter determines whether we should also consider - * regular files that are marked preallocated or append-only. + * blocks. */ bool xfs_can_free_eofblocks( - struct xfs_inode *ip, - bool force) + struct xfs_inode *ip) { struct xfs_bmbt_irec imap; struct xfs_mount *mp = ip->i_mount; xfs_fileoff_t end_fsb; xfs_fileoff_t last_fsb; @@ -674,15 +672,15 @@ xfs_can_free_eofblocks( /* If we haven't read in the extent list, then don't do it now. */ if (xfs_need_iread_extents(&ip->i_df)) return false;
/* - * Do not free real preallocated or append-only files unless the file - * has delalloc blocks and we are forced to remove them. + * Only free real extents for inodes with persistent preallocations or + * the append-only flag. */ if (ip->i_diflags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND)) - if (!force || ip->i_delayed_blks == 0) + if (ip->i_delayed_blks == 0) return false;
/* * Do not try to free post-EOF blocks if EOF is beyond the end of the * range supported by the page cache, because the truncation will loop @@ -732,10 +730,26 @@ xfs_free_eofblocks( return error;
/* Wait on dio to ensure i_size has settled. */ inode_dio_wait(VFS_I(ip));
+ /* + * For preallocated files only free delayed allocations. + * + * Note that this means we also leave speculative preallocations in + * place for preallocated files. + */ + if (ip->i_diflags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND)) { + if (ip->i_delayed_blks) { + xfs_bmap_punch_delalloc_range(ip, + round_up(XFS_ISIZE(ip), mp->m_sb.sb_blocksize), + LLONG_MAX); + } + xfs_inode_clear_eofblocks_tag(ip); + return 0; + } + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp); if (error) { ASSERT(xfs_is_shutdown(mp)); return error; } @@ -1046,11 +1060,11 @@ xfs_prepare_shift(
/* * Trim eofblocks to avoid shifting uninitialized post-eof preallocation * into the accessible region of the file. */ - if (xfs_can_free_eofblocks(ip, true)) { + if (xfs_can_free_eofblocks(ip)) { error = xfs_free_eofblocks(ip); if (error) return error; }
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h index 6888078f5c31..1383019ccdb7 100644 --- a/fs/xfs/xfs_bmap_util.h +++ b/fs/xfs/xfs_bmap_util.h @@ -61,11 +61,11 @@ int xfs_collapse_file_space(struct xfs_inode *, xfs_off_t offset, xfs_off_t len); int xfs_insert_file_space(struct xfs_inode *, xfs_off_t offset, xfs_off_t len);
/* EOF block manipulation functions */ -bool xfs_can_free_eofblocks(struct xfs_inode *ip, bool force); +bool xfs_can_free_eofblocks(struct xfs_inode *ip); int xfs_free_eofblocks(struct xfs_inode *ip);
int xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip, struct xfs_swapext *sx);
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 6df826fc787c..586d26c05160 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -1184,11 +1184,11 @@ xfs_inode_free_eofblocks( return -EAGAIN; return 0; } *lockflags |= XFS_IOLOCK_EXCL;
- if (xfs_can_free_eofblocks(ip, false)) + if (xfs_can_free_eofblocks(ip)) return xfs_free_eofblocks(ip);
/* inode could be preallocated or append-only */ trace_xfs_inode_free_eofblocks_invalid(ip); xfs_inode_clear_eofblocks_tag(ip); diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 26961b0dae03..b26d26d29273 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -1457,11 +1457,11 @@ xfs_release( * dropped, so we'll never leak blocks permanently. */ if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) return 0;
- if (xfs_can_free_eofblocks(ip, false)) { + if (xfs_can_free_eofblocks(ip)) { /* * Check if the inode is being opened, written and closed * frequently and we have delayed allocation blocks outstanding * (e.g. streaming writes from the NFS server), truncating the * blocks past EOF will cause fragmentation to occur. @@ -1673,19 +1673,17 @@ xfs_inode_needs_inactive( if (VFS_I(ip)->i_nlink == 0) return true;
/* * This file isn't being freed, so check if there are post-eof blocks - * to free. @force is true because we are evicting an inode from the - * cache. Post-eof blocks must be freed, lest we end up with broken - * free space accounting. + * to free. * * Note: don't bother with iolock here since lockdep complains about * acquiring it in reclaim context. We have the only reference to the * inode at this point anyways. */ - return xfs_can_free_eofblocks(ip, true); + return xfs_can_free_eofblocks(ip); }
/* * xfs_inactive * @@ -1732,19 +1730,15 @@ xfs_inactive( goto out; }
if (VFS_I(ip)->i_nlink != 0) { /* - * force is true because we are evicting an inode from the - * cache. Post-eof blocks must be freed, lest we end up with - * broken free space accounting. - * * Note: don't bother with iolock here since lockdep complains * about acquiring it in reclaim context. We have the only * reference to the inode at this point anyways. */ - if (xfs_can_free_eofblocks(ip, true)) + if (xfs_can_free_eofblocks(ip)) error = xfs_free_eofblocks(ip);
goto out; }
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 610b29161b0aa9feb59b78dc867553274f17fb01
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Christoph Hellwighch@lst.de
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 2bc2d49c36c2)
Note: The patch differs from the upstream commit: --- 1: 610b29161b0aa ! 1: 1695ca72c95de xfs: fix freeing speculative preallocations for preallocated files @@ Metadata ## Commit message ## xfs: fix freeing speculative preallocations for preallocated files
+ [ Upstream commit 610b29161b0aa9feb59b78dc867553274f17fb01 ] + xfs_can_free_eofblocks returns false for files that have persistent preallocations unless the force flag is passed and there are delayed blocks. This means it won't free delalloc reservations for files @@ Commit message Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/xfs_bmap_util.c ## @@ fs/xfs/xfs_bmap_util.c: xfs_bmap_punch_delalloc_range( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 1ec9307fc066dd8a140d5430f8a7576aa9d78cd3 ]
For a very very long time, inode inactivation has set the inode size to zero before unmapping the extents associated with the data fork. Unfortunately, commit 3c6f46eacd876 changed the inode verifier to prohibit zero-length symlinks and directories. If an inode happens to get logged in this state and the system crashes before freeing the inode, log recovery will also fail on the broken inode.
Therefore, allow zero-size symlinks and directories as long as the link count is zero; nobody will be able to open these files by handle so there isn't any risk of data exposure.
Fixes: 3c6f46eacd876 ("xfs: sanity check directory inode di_size") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_inode_buf.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 127c9a698d9f..3c611c8ac158 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -368,14 +368,17 @@ xfs_dinode_verify_fork( */ if (whichfork == XFS_DATA_FORK) { /* * A directory small enough to fit in the inode must be stored * in local format. The directory sf <-> extents conversion - * code updates the directory size accordingly. + * code updates the directory size accordingly. Directories + * being truncated have zero size and are not subject to this + * check. */ if (S_ISDIR(mode)) { - if (be64_to_cpu(dip->di_size) <= fork_size && + if (dip->di_size && + be64_to_cpu(dip->di_size) <= fork_size && fork_format != XFS_DINODE_FMT_LOCAL) return __this_address; }
/* @@ -509,13 +512,23 @@ xfs_dinode_verify(
mode = be16_to_cpu(dip->di_mode); if (mode && xfs_mode_to_ftype(mode) == XFS_DIR3_FT_UNKNOWN) return __this_address;
- /* No zero-length symlinks/dirs. */ - if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0) - return __this_address; + /* + * No zero-length symlinks/dirs unless they're unlinked and hence being + * inactivated. + */ + if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0) { + if (dip->di_version > 1) { + if (dip->di_nlink) + return __this_address; + } else { + if (dip->di_onlink) + return __this_address; + } + }
fa = xfs_dinode_verify_nrext64(mp, dip); if (fa) return fa;
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 1ec9307fc066dd8a140d5430f8a7576aa9d78cd3
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 3eeac3311683)
Note: The patch differs from the upstream commit: --- 1: 1ec9307fc066d ! 1: 2121dc3036b86 xfs: allow unlinked symlinks and dirs with zero size @@ Metadata ## Commit message ## xfs: allow unlinked symlinks and dirs with zero size
+ [ Upstream commit 1ec9307fc066dd8a140d5430f8a7576aa9d78cd3 ] + For a very very long time, inode inactivation has set the inode size to zero before unmapping the extents associated with the data fork. Unfortunately, commit 3c6f46eacd876 changed the inode verifier to @@ Commit message Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_inode_buf.c ## @@ fs/xfs/libxfs/xfs_inode_buf.c: xfs_dinode_verify_fork( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 288e1f693f04e66be99f27e7cbe4a45936a66745 ]
xfs/205 produces the following failure when always_cow is enabled:
--- a/tests/xfs/205.out 2024-02-28 16:20:24.437887970 -0800 +++ b/tests/xfs/205.out.bad 2024-06-03 21:13:40.584000000 -0700 @@ -1,4 +1,5 @@ QA output created by 205 *** one file + !!! disk full (expected) *** one file, a few bytes at a time *** done
This is the result of overly aggressive attempts to align cow fork delalloc reservations to the CoW extent size hint. Looking at the trace data, we're trying to append a single fsblock to the "fred" file. Trying to create a speculative post-eof reservation fails because there's not enough space.
We then set @prealloc_blocks to zero and try again, but the cowextsz alignment code triggers, which expands our request for a 1-fsblock reservation into a 39-block reservation. There's not enough space for that, so the whole write fails with ENOSPC even though there's sufficient space in the filesystem to allocate the single block that we need to land the write.
There are two things wrong here -- first, we shouldn't be attempting speculative preallocations beyond what was requested when we're low on space. Second, if we've already computed a posteof preallocation, we shouldn't bother trying to align that to the cowextsize hint.
Fix both of these problems by adding a flag that only enables the expansion of the delalloc reservation to the cowextsize if we're doing a non-extending write, and only if we're not doing an ENOSPC retry. This requires us to move the ENOSPC retry logic to xfs_bmapi_reserve_delalloc.
I probably should have caught this six years ago when 6ca30729c206d was being reviewed, but oh well. Update the comments to reflect what the code does now.
Fixes: 6ca30729c206d ("xfs: bmap code cleanup") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_bmap.c | 31 +++++++++++++++++++++++++++---- fs/xfs/xfs_iomap.c | 34 ++++++++++++---------------------- 2 files changed, 39 insertions(+), 26 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index a3c4d4a442af..14b0d230f61b 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3957,43 +3957,55 @@ xfs_bmapi_reserve_delalloc( struct xfs_mount *mp = ip->i_mount; struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork); xfs_extlen_t alen; xfs_extlen_t indlen; int error; - xfs_fileoff_t aoff = off; + xfs_fileoff_t aoff; + bool use_cowextszhint = + whichfork == XFS_COW_FORK && !prealloc;
+retry: /* * Cap the alloc length. Keep track of prealloc so we know whether to * tag the inode before we return. */ + aoff = off; alen = XFS_FILBLKS_MIN(len + prealloc, XFS_MAX_BMBT_EXTLEN); if (!eof) alen = XFS_FILBLKS_MIN(alen, got->br_startoff - aoff); if (prealloc && alen >= len) prealloc = alen - len;
- /* Figure out the extent size, adjust alen */ - if (whichfork == XFS_COW_FORK) { + /* + * If we're targetting the COW fork but aren't creating a speculative + * posteof preallocation, try to expand the reservation to align with + * the COW extent size hint if there's sufficient free space. + * + * Unlike the data fork, the CoW cancellation functions will free all + * the reservations at inactivation, so we don't require that every + * delalloc reservation have a dirty pagecache. + */ + if (use_cowextszhint) { struct xfs_bmbt_irec prev; xfs_extlen_t extsz = xfs_get_cowextsz_hint(ip);
if (!xfs_iext_peek_prev_extent(ifp, icur, &prev)) prev.br_startoff = NULLFILEOFF;
error = xfs_bmap_extsize_align(mp, got, &prev, extsz, 0, eof, 1, 0, &aoff, &alen); ASSERT(!error); }
/* * Make a transaction-less quota reservation for delayed allocation * blocks. This number gets adjusted later. We return if we haven't * allocated blocks already inside this loop. */ error = xfs_quota_reserve_blkres(ip, alen); if (error) - return error; + goto out;
/* * Split changing sb for alen and indlen since they could be coming * from different places. */ @@ -4034,10 +4046,21 @@ xfs_bmapi_reserve_delalloc( out_unreserve_blocks: xfs_mod_fdblocks(mp, alen, false); out_unreserve_quota: if (XFS_IS_QUOTA_ON(mp)) xfs_quota_unreserve_blkres(ip, alen); +out: + if (error == -ENOSPC || error == -EDQUOT) { + trace_xfs_delalloc_enospc(ip, off, len); + + if (prealloc || use_cowextszhint) { + /* retry without any preallocation */ + use_cowextszhint = false; + prealloc = 0; + goto retry; + } + } return error; }
static int xfs_bmap_alloc_userdata( diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index fab191a09442..28a1c19dfdb3 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1113,37 +1113,27 @@ xfs_buffered_write_iomap_begin( ASSERT(p_end_fsb > offset_fsb); prealloc_blocks = p_end_fsb - end_fsb; } }
-retry: - error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb, - end_fsb - offset_fsb, prealloc_blocks, - allocfork == XFS_DATA_FORK ? &imap : &cmap, - allocfork == XFS_DATA_FORK ? &icur : &ccur, - allocfork == XFS_DATA_FORK ? eof : cow_eof); - switch (error) { - case 0: - break; - case -ENOSPC: - case -EDQUOT: - /* retry without any preallocation */ - trace_xfs_delalloc_enospc(ip, offset, count); - if (prealloc_blocks) { - prealloc_blocks = 0; - goto retry; - } - fallthrough; - default: - goto out_unlock; - } - if (allocfork == XFS_COW_FORK) { + error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb, + end_fsb - offset_fsb, prealloc_blocks, &cmap, + &ccur, cow_eof); + if (error) + goto out_unlock; + trace_xfs_iomap_alloc(ip, offset, count, allocfork, &cmap); goto found_cow; }
+ error = xfs_bmapi_reserve_delalloc(ip, allocfork, offset_fsb, + end_fsb - offset_fsb, prealloc_blocks, &imap, &icur, + eof); + if (error) + goto out_unlock; + /* * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch * them out if the write happens to fail. */ seq = xfs_iomap_inode_sequence(ip, IOMAP_F_NEW);
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 288e1f693f04e66be99f27e7cbe4a45936a66745
WARNING: Author mismatch between patch and upstream commit: Backport author: Leah Rumancikleah.rumancik@gmail.com Commit author: Darrick J. Wongdjwong@kernel.org
Status in newer kernel trees: 6.14.y | Present (exact SHA1) 6.12.y | Present (exact SHA1) 6.6.y | Present (different SHA1: 9a0ab4fc28ed)
Note: The patch differs from the upstream commit: --- 1: 288e1f693f04e ! 1: 982df6cb2b8c2 xfs: restrict when we try to align cow fork delalloc to cowextsz hints @@ Metadata ## Commit message ## xfs: restrict when we try to align cow fork delalloc to cowextsz hints
+ [ Upstream commit 288e1f693f04e66be99f27e7cbe4a45936a66745 ] + xfs/205 produces the following failure when always_cow is enabled:
--- a/tests/xfs/205.out 2024-02-28 16:20:24.437887970 -0800 @@ Commit message Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Chandan Babu R chandanbabu@kernel.org + Signed-off-by: Leah Rumancik leah.rumancik@gmail.com + Acked-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_bmap.c ## @@ fs/xfs/libxfs/xfs_bmap.c: xfs_bmapi_reserve_delalloc( + xfs_extlen_t alen; xfs_extlen_t indlen; - uint64_t fdblocks; int error; - xfs_fileoff_t aoff = off; + xfs_fileoff_t aoff; ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
linux-stable-mirror@lists.linaro.org