From: Chuck Lever <chuck.lever(a)oracle.com>
J. David reports an odd corruption of a READDIR reply sent to a
FreeBSD client.
xdr_reserve_space() has to do a special trick when the @nbytes value
requests more space than there is in the current page of the XDR
buffer.
In that case, xdr_reserve_space() returns a pointer to the start of
the next page, and then the next call to xdr_reserve_space() invokes
__xdr_commit_encode() to copy enough of the data item back into the
previous page to make that data item contiguous across the page
boundary.
But we need to be careful in the case where buffer space is reserved
early for a data item that will be inserted into the buffer later.
One such caller, nfsd4_encode_operation(), reserves 8 bytes in the
encoding buffer for each COMPOUND operation. However, a READDIR
result can sometimes encode file names so that there are only 4
bytes left at the end of the current XDR buffer page (though plenty
of pages are left to handle the remaining encoding tasks).
If a COMPOUND operation follows the READDIR result (say, a GETATTR),
then nfsd4_encode_operation() will reserve 8 bytes for the op number
(9) and the op status (usually NFS4_OK). In this weird case,
xdr_reserve_space() returns a pointer to byte zero of the next buffer
page, as it assumes the data item will be copied back into place (in
the previous page) on the next call to xdr_reserve_space().
nfsd4_encode_operation() writes the op num into the buffer, then
saves the next 4-byte location for the op's status code. The next
xdr_reserve_space() call is part of GETATTR encoding, so the op num
gets copied back into the previous page, but the saved location for
the op status continues to point to the wrong spot in the current
XDR buffer page because __xdr_commit_encode() moved that data item.
After GETATTR encoding is complete, nfsd4_encode_operation() writes
the op status over the first XDR data item in the GETATTR result.
The NFS4_OK status code (0) makes it look like there are zero items
in the GETATTR's attribute bitmask.
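
The sequence above can be reproduced in a small userspace model. This is
only a sketch under stated assumptions: struct model_stream,
model_reserve(), model_commit() and model_encode() are hypothetical names,
the page size is shrunk to 16 bytes, and values are stored in host byte
order. It is not the kernel's struct xdr_stream or its real code paths; it
only mimics the "hand out the next page, copy back on the next
reservation" behaviour described in this message.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16u	/* tiny page so the effect is easy to see */
#define MODEL_NPAGES 2u

/* Hypothetical model of a paged encode buffer (not struct xdr_stream). */
struct model_stream {
	unsigned char pages[MODEL_NPAGES][MODEL_PAGE_SIZE];
	size_t page;	/* index of the page being filled */
	size_t used;	/* bytes used in that page */
	size_t shift;	/* bytes still owed to the tail of the previous page */
};

/* Copy the head of a straddling item back into the previous page. */
static void model_commit(struct model_stream *s)
{
	unsigned char *cur = s->pages[s->page];

	if (!s->shift)
		return;
	memcpy(s->pages[s->page - 1] + MODEL_PAGE_SIZE - s->shift, cur, s->shift);
	memmove(cur, cur + s->shift, s->used - s->shift);
	s->used -= s->shift;
	s->shift = 0;
}

static unsigned char *model_reserve(struct model_stream *s, size_t nbytes)
{
	size_t room;
	unsigned char *p;

	model_commit(s);	/* earlier straddling item moves here */
	room = MODEL_PAGE_SIZE - s->used;
	if (nbytes <= room) {
		p = s->pages[s->page] + s->used;
		s->used += nbytes;
		return p;
	}
	if (s->page + 1 >= MODEL_NPAGES)
		return NULL;
	/* Does not fit: hand out the start of the next page for now. */
	s->shift = room;
	s->page++;
	s->used = nbytes;
	return s->pages[s->page];
}

/*
 * Encode "op num, op status, first word of the next result" with only
 * 4 bytes left in page 0. Returns the first word of the next result as
 * a reader would see it. split == 0 models one 8-byte reservation,
 * split != 0 models two 4-byte reservations.
 */
static uint32_t model_encode(int split)
{
	struct model_stream s = { .used = MODEL_PAGE_SIZE - 4 };
	uint32_t opnum = 9, status = 0, bitmap_len = 3, seen;
	unsigned char *p, *statp, *q;

	if (split) {
		p = model_reserve(&s, 4);
		statp = model_reserve(&s, 4);
	} else {
		p = model_reserve(&s, 8);
		statp = p + 4;	/* saved before the copy-back happens */
	}
	memcpy(p, &opnum, 4);

	q = model_reserve(&s, 4);	/* next item; triggers the copy-back */
	memcpy(q, &bitmap_len, 4);

	memcpy(statp, &status, 4);	/* late write through the saved pointer */
	memcpy(&seen, q, 4);
	return seen;
}
```

In this model, model_encode(0) returns 0 because the late status write
lands on the next item's first word, while model_encode(1) returns the
intended 3.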
The patch description of commit 2825a7f90753 ("nfsd4: allow encoding
across page boundaries") [2014] remarks that NFSD "can't handle a
new operation starting close to the end of a page." This behavior
appears to be one reason for that remark.
Break up the reservation of the COMPOUND op num and op status data
items into two distinct 4-octet reservations. Thanks to XDR data
item alignment restrictions, a 4-octet buffer reservation can never
straddle a page boundary.
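
The alignment argument can be checked by brute force. The sketch below
assumes a 4096-byte page and a buffer that starts on a page boundary;
SKETCH_PAGE_SIZE, SKETCH_XDR_UNIT, straddles_page() and count_straddles()
are illustrative names, not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_XDR_UNIT 4u	/* XDR items are 4-byte aligned */

/* True when the byte range [offset, offset + nbytes) crosses a page boundary. */
static bool straddles_page(size_t offset, size_t nbytes)
{
	return offset / SKETCH_PAGE_SIZE !=
	       (offset + nbytes - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Count the 4-byte-aligned start offsets within one page at which a
 * reservation of nbytes would straddle into the following page.
 */
static size_t count_straddles(size_t nbytes)
{
	size_t off, n = 0;

	for (off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_XDR_UNIT)
		if (straddles_page(off, nbytes))
			n++;
	return n;
}
```

Under these assumptions count_straddles(4) is 0, while count_straddles(8)
is 1: the single bad case is the offset that leaves exactly 4 bytes in the
page.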
Reported-by: J David <j.david.lists(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/nfsd/nfs4xdr.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 53fac037611c..8780da884197 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -5764,10 +5764,11 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
nfsd4_enc encoder;
__be32 *p;
- p = xdr_reserve_space(xdr, 8);
+ if (xdr_stream_encode_u32(xdr, op->opnum) != XDR_UNIT)
+ goto release;
+ p = xdr_reserve_space(xdr, XDR_UNIT);
if (!p)
goto release;
- *p++ = cpu_to_be32(op->opnum);
post_err_offset = xdr->buf->len;
if (op->opnum == OP_ILLEGAL)
--
2.47.0
Hi all,
This is the latest revision of a patchset that adds to XFS kernel
support for reverse mapping for the realtime device. This time around
I've fixed some of the bitrot that I've noticed over the past few
months, and most notably have converted rtrmapbt to use the metadata
inode directory feature instead of burning more space in the superblock.
At the beginning of the set are patches to implement storing B+tree
leaves in an inode root, since the realtime rmapbt is rooted in an
inode, unlike the regular rmapbt which is rooted in an AG block.
Prior to this, the only btree that could be rooted in the inode fork
was the block mapping btree; if all the extent records fit in the
inode, format would be switched from 'btree' to 'extents'.
The next few patches enhance the reverse mapping routines to handle
the parts that are specific to rtgroups -- adding the new btree type,
adding a new log intent item type, and wiring up the metadata directory
tree entries.
Finally, implement GETFSMAP with the rtrmapbt, scrub functionality for
the rtrmapbt and rtbitmap, and online fsck functionality.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=re…
xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h…
fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h…
xfsdocs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-documentation.git/l…
---
Commits in this patchset:
* xfs: add some rtgroup inode helpers
* xfs: prepare rmap btree cursor tracepoints for realtime
* xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions
* xfs: introduce realtime rmap btree ondisk definitions
* xfs: realtime rmap btree transaction reservations
* xfs: add realtime rmap btree operations
* xfs: prepare rmap functions to deal with rtrmapbt
* xfs: add a realtime flag to the rmap update log redo items
* xfs: support recovering rmap intent items targetting realtime extents
* xfs: pretty print metadata file types in error messages
* xfs: support file data forks containing metadata btrees
* xfs: add realtime reverse map inode to metadata directory
* xfs: add metadata reservations for realtime rmap btrees
* xfs: wire up a new metafile type for the realtime rmap
* xfs: wire up rmap map and unmap to the realtime rmapbt
* xfs: create routine to allocate and initialize a realtime rmap btree inode
* xfs: wire up getfsmap to the realtime reverse mapping btree
* xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs
* xfs: report realtime rmap btree corruption errors to the health system
* xfs: allow queued realtime intents to drain before scrubbing
* xfs: scrub the realtime rmapbt
* xfs: cross-reference realtime bitmap to realtime rmapbt scrubber
* xfs: cross-reference the realtime rmapbt
* xfs: scan rt rmap when we're doing an intense rmap check of bmbt mappings
* xfs: scrub the metadir path of rt rmap btree files
* xfs: walk the rt reverse mapping tree when rebuilding rmap
* xfs: online repair of realtime file bmaps
* xfs: repair inodes that have realtime extents
* xfs: repair rmap btree inodes
* xfs: online repair of realtime bitmaps for a realtime group
* xfs: support repairing metadata btrees rooted in metadir inodes
* xfs: online repair of the realtime rmap btree
* xfs: create a shadow rmap btree during realtime rmap repair
* xfs: hook live realtime rmap operations during a repair operation
* xfs: don't shut down the filesystem for media failures beyond end of log
* xfs: react to fsdax failure notifications on the rt device
* xfs: enable realtime rmap btree
---
fs/xfs/Makefile | 3
fs/xfs/libxfs/xfs_btree.c | 73 +++
fs/xfs/libxfs/xfs_btree.h | 8
fs/xfs/libxfs/xfs_btree_mem.c | 1
fs/xfs/libxfs/xfs_btree_staging.c | 1
fs/xfs/libxfs/xfs_defer.h | 1
fs/xfs/libxfs/xfs_exchmaps.c | 4
fs/xfs/libxfs/xfs_format.h | 28 +
fs/xfs/libxfs/xfs_fs.h | 7
fs/xfs/libxfs/xfs_health.h | 4
fs/xfs/libxfs/xfs_inode_buf.c | 32 +
fs/xfs/libxfs/xfs_inode_fork.c | 25 +
fs/xfs/libxfs/xfs_log_format.h | 6
fs/xfs/libxfs/xfs_log_recover.h | 2
fs/xfs/libxfs/xfs_metafile.c | 18 +
fs/xfs/libxfs/xfs_metafile.h | 2
fs/xfs/libxfs/xfs_ondisk.h | 2
fs/xfs/libxfs/xfs_refcount.c | 6
fs/xfs/libxfs/xfs_rmap.c | 171 +++++-
fs/xfs/libxfs/xfs_rmap.h | 12
fs/xfs/libxfs/xfs_rtbitmap.c | 2
fs/xfs/libxfs/xfs_rtbitmap.h | 9
fs/xfs/libxfs/xfs_rtgroup.c | 53 +-
fs/xfs/libxfs/xfs_rtgroup.h | 49 ++
fs/xfs/libxfs/xfs_rtrmap_btree.c | 1011 +++++++++++++++++++++++++++++++++++++
fs/xfs/libxfs/xfs_rtrmap_btree.h | 210 ++++++++
fs/xfs/libxfs/xfs_sb.c | 6
fs/xfs/libxfs/xfs_shared.h | 14 +
fs/xfs/libxfs/xfs_trans_resv.c | 12
fs/xfs/libxfs/xfs_trans_space.h | 13
fs/xfs/scrub/alloc_repair.c | 5
fs/xfs/scrub/bmap.c | 108 +++-
fs/xfs/scrub/bmap_repair.c | 129 +++++
fs/xfs/scrub/common.c | 160 ++++++
fs/xfs/scrub/common.h | 23 +
fs/xfs/scrub/health.c | 1
fs/xfs/scrub/inode.c | 10
fs/xfs/scrub/inode_repair.c | 136 +++++
fs/xfs/scrub/metapath.c | 3
fs/xfs/scrub/newbt.c | 42 ++
fs/xfs/scrub/newbt.h | 1
fs/xfs/scrub/reap.c | 41 ++
fs/xfs/scrub/reap.h | 2
fs/xfs/scrub/repair.c | 191 +++++++
fs/xfs/scrub/repair.h | 17 +
fs/xfs/scrub/rgsuper.c | 6
fs/xfs/scrub/rmap_repair.c | 84 +++
fs/xfs/scrub/rtbitmap.c | 75 ++-
fs/xfs/scrub/rtbitmap.h | 55 ++
fs/xfs/scrub/rtbitmap_repair.c | 429 +++++++++++++++-
fs/xfs/scrub/rtrmap.c | 271 ++++++++++
fs/xfs/scrub/rtrmap_repair.c | 903 +++++++++++++++++++++++++++++++++
fs/xfs/scrub/rtsummary.c | 17 -
fs/xfs/scrub/rtsummary_repair.c | 3
fs/xfs/scrub/scrub.c | 11
fs/xfs/scrub/scrub.h | 14 +
fs/xfs/scrub/stats.c | 1
fs/xfs/scrub/tempexch.h | 2
fs/xfs/scrub/tempfile.c | 20 -
fs/xfs/scrub/trace.c | 1
fs/xfs/scrub/trace.h | 228 ++++++++
fs/xfs/xfs_buf.c | 1
fs/xfs/xfs_buf_item_recover.c | 4
fs/xfs/xfs_drain.c | 20 -
fs/xfs/xfs_drain.h | 7
fs/xfs/xfs_fsmap.c | 174 ++++++
fs/xfs/xfs_fsops.c | 11
fs/xfs/xfs_health.c | 1
fs/xfs/xfs_inode.c | 19 +
fs/xfs/xfs_inode_item.c | 2
fs/xfs/xfs_inode_item_recover.c | 44 +-
fs/xfs/xfs_log_recover.c | 2
fs/xfs/xfs_mount.c | 5
fs/xfs/xfs_mount.h | 9
fs/xfs/xfs_notify_failure.c | 230 +++++---
fs/xfs/xfs_notify_failure.h | 11
fs/xfs/xfs_qm.c | 8
fs/xfs/xfs_rmap_item.c | 216 +++++++-
fs/xfs/xfs_rtalloc.c | 82 ++-
fs/xfs/xfs_rtalloc.h | 10
fs/xfs/xfs_stats.c | 4
fs/xfs/xfs_stats.h | 2
fs/xfs/xfs_super.c | 6
fs/xfs/xfs_super.h | 1
fs/xfs/xfs_trace.h | 104 ++--
85 files changed, 5381 insertions(+), 366 deletions(-)
create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.c
create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.h
create mode 100644 fs/xfs/scrub/rtrmap.c
create mode 100644 fs/xfs/scrub/rtrmap_repair.c
create mode 100644 fs/xfs/xfs_notify_failure.h
Hi all,
Bug fixes for 6.13.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=xf…
xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h…
---
Commits in this patchset:
* xfs: don't over-report free space or inodes in statvfs
* xfs: release the dquot buf outside of qli_lock
---
fs/xfs/xfs_dquot.c | 12 ++++++++----
fs/xfs/xfs_qm_bhv.c | 27 +++++++++++++++++----------
2 files changed, 25 insertions(+), 14 deletions(-)
Hi Andrey,
Please pull this branch with changes for xfsprogs for 6.11-rc1.
As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts. Please let me know if you
encounter any problems.
--D
The following changes since commit 513300e9565b0d446ac8e6a3a990444d766c728b:
mkfs: add a utility to generate protofiles (2024-12-23 13:05:10 -0800)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git tags/xfs-6.13-merge_2024-12-23
for you to fetch changes up to 80b81c84f015ee01fed80c32184cc763ee1a655e:
xfs: return from xfs_symlink_verify early on V4 filesystems (2024-12-23 13:05:13 -0800)
----------------------------------------------------------------
xfsprogs: new code for 6.13 [05/23]
New code for 6.12.
This has been running on the djcloud for months with no problems. Enjoy!
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
----------------------------------------------------------------
Christoph Hellwig (9):
xfs: add a xfs_bmap_free_rtblocks helper
xfs: move RT bitmap and summary information to the rtgroup
xfs: support creating per-RTG files in growfs
xfs: refactor xfs_rtbitmap_blockcount
xfs: refactor xfs_rtsummary_blockcount
xfs: make RT extent numbers relative to the rtgroup
xfs: add a helper to prevent bmap merges across rtgroup boundaries
xfs: make the RT allocator rtgroup aware
xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay
Darrick J. Wong (40):
xfs: create incore realtime group structures
xfs: define locking primitives for realtime groups
xfs: add a lockdep class key for rtgroup inodes
xfs: support caching rtgroup metadata inodes
libfrog: add memchr_inv
xfs: define the format of rt groups
xfs: update realtime super every time we update the primary fs super
xfs: export realtime group geometry via XFS_FSOP_GEOM
xfs: check that rtblock extents do not break rtsupers or rtgroups
xfs: add frextents to the lazysbcounters when rtgroups enabled
xfs: record rt group metadata errors in the health system
xfs: export the geometry of realtime groups to userspace
xfs: add block headers to realtime bitmap and summary blocks
xfs: encode the rtbitmap in big endian format
xfs: encode the rtsummary in big endian format
xfs: grow the realtime section when realtime groups are enabled
xfs: support logging EFIs for realtime extents
xfs: support error injection when freeing rt extents
xfs: use realtime EFI to free extents when rtgroups are enabled
xfs: don't merge ioends across RTGs
xfs: scrub the realtime group superblock
xfs: scrub metadir paths for rtgroup metadata
xfs: mask off the rtbitmap and summary inodes when metadir in use
xfs: create helpers to deal with rounding xfs_fileoff_t to rtx boundaries
xfs: create helpers to deal with rounding xfs_filblks_t to rtx boundaries
xfs: make xfs_rtblock_t a segmented address like xfs_fsblock_t
xfs: adjust min_block usage in xfs_verify_agbno
xfs: move the min and max group block numbers to xfs_group
xfs: implement busy extent tracking for rtgroups
xfs: use metadir for quota inodes
xfs: scrub quota file metapaths
xfs: enable metadata directory feature
xfs: convert struct typedefs in xfs_ondisk.h
xfs: separate space btree structures in xfs_ondisk.h
xfs: port ondisk structure checks from xfs/122 to the kernel
xfs: return a 64-bit block count from xfs_btree_count_blocks
xfs: fix error bailout in xfs_rtginode_create
xfs: update btree keys correctly when _insrec splits an inode root block
xfs: fix sb_spino_align checks for large fsblock sizes
xfs: return from xfs_symlink_verify early on V4 filesystems
Dave Chinner (1):
xfs: fix sparse inode limits on runt AG
Jeff Layton (1):
xfs: switch to multigrain timestamps
Long Li (1):
xfs: remove unknown compat feature check in superblock write validation
db/block.c | 2 +-
db/block.h | 16 -
db/convert.c | 1 -
db/faddr.c | 1 -
include/libxfs.h | 2 +
include/platform_defs.h | 33 +++
include/xfs_mount.h | 30 +-
include/xfs_trace.h | 7 +
include/xfs_trans.h | 1 +
libfrog/util.c | 14 +
libfrog/util.h | 4 +
libxfs/Makefile | 2 +
libxfs/init.c | 35 ++-
libxfs/libxfs_api_defs.h | 16 +
libxfs/libxfs_io.h | 1 +
libxfs/libxfs_priv.h | 34 +--
libxfs/rdwr.c | 17 ++
libxfs/trans.c | 29 ++
libxfs/util.c | 8 +-
libxfs/xfs_ag.c | 22 +-
libxfs/xfs_ag.h | 16 +-
libxfs/xfs_alloc.c | 15 +-
libxfs/xfs_alloc.h | 12 +-
libxfs/xfs_bmap.c | 124 ++++++--
libxfs/xfs_btree.c | 33 ++-
libxfs/xfs_btree.h | 2 +-
libxfs/xfs_defer.c | 6 +
libxfs/xfs_defer.h | 1 +
libxfs/xfs_dquot_buf.c | 190 ++++++++++++
libxfs/xfs_format.h | 80 ++++-
libxfs/xfs_fs.h | 32 +-
libxfs/xfs_group.h | 33 +++
libxfs/xfs_health.h | 42 +--
libxfs/xfs_ialloc.c | 16 +-
libxfs/xfs_ialloc_btree.c | 6 +-
libxfs/xfs_log_format.h | 6 +-
libxfs/xfs_ondisk.h | 186 +++++++++---
libxfs/xfs_quota_defs.h | 43 +++
libxfs/xfs_rtbitmap.c | 405 +++++++++++++++++---------
libxfs/xfs_rtbitmap.h | 247 ++++++++++------
libxfs/xfs_rtgroup.c | 694 ++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_rtgroup.h | 284 ++++++++++++++++++
libxfs/xfs_sb.c | 246 ++++++++++++++--
libxfs/xfs_sb.h | 6 +-
libxfs/xfs_shared.h | 4 +
libxfs/xfs_symlink_remote.c | 4 +-
libxfs/xfs_trans_inode.c | 6 +-
libxfs/xfs_trans_resv.c | 2 +-
libxfs/xfs_types.c | 35 ++-
libxfs/xfs_types.h | 8 +-
mkfs/proto.c | 33 ++-
mkfs/xfs_mkfs.c | 8 +
repair/dinode.c | 4 +-
repair/phase6.c | 203 +++++++------
repair/rt.c | 34 +--
repair/rt.h | 4 +-
56 files changed, 2728 insertions(+), 617 deletions(-)
create mode 100644 libxfs/xfs_rtgroup.c
create mode 100644 libxfs/xfs_rtgroup.h
Hi all,
New code for 6.12.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=xf…
xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h…
---
Commits in this patchset:
* xfs: create incore realtime group structures
* xfs: define locking primitives for realtime groups
* xfs: add a lockdep class key for rtgroup inodes
* xfs: support caching rtgroup metadata inodes
* xfs: add a xfs_bmap_free_rtblocks helper
* xfs: move RT bitmap and summary information to the rtgroup
* xfs: support creating per-RTG files in growfs
* xfs: refactor xfs_rtbitmap_blockcount
* xfs: refactor xfs_rtsummary_blockcount
* xfs: make RT extent numbers relative to the rtgroup
* libfrog: add memchr_inv
* xfs: define the format of rt groups
* xfs: update realtime super every time we update the primary fs super
* xfs: export realtime group geometry via XFS_FSOP_GEOM
* xfs: check that rtblock extents do not break rtsupers or rtgroups
* xfs: add a helper to prevent bmap merges across rtgroup boundaries
* xfs: add frextents to the lazysbcounters when rtgroups enabled
* xfs: record rt group metadata errors in the health system
* xfs: export the geometry of realtime groups to userspace
* xfs: add block headers to realtime bitmap and summary blocks
* xfs: encode the rtbitmap in big endian format
* xfs: encode the rtsummary in big endian format
* xfs: grow the realtime section when realtime groups are enabled
* xfs: support logging EFIs for realtime extents
* xfs: support error injection when freeing rt extents
* xfs: use realtime EFI to free extents when rtgroups are enabled
* xfs: don't merge ioends across RTGs
* xfs: make the RT allocator rtgroup aware
* xfs: scrub the realtime group superblock
* xfs: scrub metadir paths for rtgroup metadata
* xfs: mask off the rtbitmap and summary inodes when metadir in use
* xfs: create helpers to deal with rounding xfs_fileoff_t to rtx boundaries
* xfs: create helpers to deal with rounding xfs_filblks_t to rtx boundaries
* xfs: make xfs_rtblock_t a segmented address like xfs_fsblock_t
* xfs: adjust min_block usage in xfs_verify_agbno
* xfs: move the min and max group block numbers to xfs_group
* xfs: implement busy extent tracking for rtgroups
* xfs: use metadir for quota inodes
* xfs: scrub quota file metapaths
* xfs: enable metadata directory feature
* xfs: convert struct typedefs in xfs_ondisk.h
* xfs: separate space btree structures in xfs_ondisk.h
* xfs: port ondisk structure checks from xfs/122 to the kernel
* xfs: remove unknown compat feature check in superblock write validation
* xfs: fix sparse inode limits on runt AG
* xfs: switch to multigrain timestamps
* xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay
* xfs: return a 64-bit block count from xfs_btree_count_blocks
* xfs: fix error bailout in xfs_rtginode_create
* xfs: update btree keys correctly when _insrec splits an inode root block
* xfs: fix sb_spino_align checks for large fsblock sizes
* xfs: return from xfs_symlink_verify early on V4 filesystems
---
db/block.c | 2
db/block.h | 16 -
db/convert.c | 1
db/faddr.c | 1
include/libxfs.h | 2
include/platform_defs.h | 33 ++
include/xfs_mount.h | 30 +-
include/xfs_trace.h | 7
include/xfs_trans.h | 1
libfrog/util.c | 14 +
libfrog/util.h | 4
libxfs/Makefile | 2
libxfs/init.c | 35 ++
libxfs/libxfs_api_defs.h | 16 +
libxfs/libxfs_io.h | 1
libxfs/libxfs_priv.h | 34 --
libxfs/rdwr.c | 17 +
libxfs/trans.c | 29 ++
libxfs/util.c | 8
libxfs/xfs_ag.c | 22 +
libxfs/xfs_ag.h | 16 -
libxfs/xfs_alloc.c | 15 +
libxfs/xfs_alloc.h | 12 +
libxfs/xfs_bmap.c | 124 ++++++--
libxfs/xfs_btree.c | 33 ++
libxfs/xfs_btree.h | 2
libxfs/xfs_defer.c | 6
libxfs/xfs_defer.h | 1
libxfs/xfs_dquot_buf.c | 190 ++++++++++++
libxfs/xfs_format.h | 80 +++++
libxfs/xfs_fs.h | 32 ++
libxfs/xfs_group.h | 33 ++
libxfs/xfs_health.h | 42 ++-
libxfs/xfs_ialloc.c | 16 +
libxfs/xfs_ialloc_btree.c | 6
libxfs/xfs_log_format.h | 6
libxfs/xfs_ondisk.h | 186 +++++++++---
libxfs/xfs_quota_defs.h | 43 +++
libxfs/xfs_rtbitmap.c | 405 +++++++++++++++++--------
libxfs/xfs_rtbitmap.h | 247 ++++++++++-----
libxfs/xfs_rtgroup.c | 694 +++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_rtgroup.h | 284 ++++++++++++++++++
libxfs/xfs_sb.c | 246 ++++++++++++++-
libxfs/xfs_sb.h | 6
libxfs/xfs_shared.h | 4
libxfs/xfs_symlink_remote.c | 4
libxfs/xfs_trans_inode.c | 6
libxfs/xfs_trans_resv.c | 2
libxfs/xfs_types.c | 35 ++
libxfs/xfs_types.h | 8
mkfs/proto.c | 33 +-
mkfs/xfs_mkfs.c | 8
repair/dinode.c | 4
repair/phase6.c | 203 ++++++-------
repair/rt.c | 34 --
repair/rt.h | 4
56 files changed, 2728 insertions(+), 617 deletions(-)
create mode 100644 libxfs/xfs_rtgroup.c
create mode 100644 libxfs/xfs_rtgroup.h
The Host Port (i.e. CPU facing port) of CPSW receives traffic from Linux
via TX DMA Channels which are Hardware Queues consisting of traffic
categorized according to their priority. The Host Port is configured to
dequeue traffic from these Hardware Queues on the basis of priority i.e.
as long as traffic exists on a Hardware Queue of a higher priority, the
traffic on Hardware Queues of lower priority isn't dequeued. An alternate
operation is also supported wherein traffic can be dequeued by the Host
Port in a Round-Robin manner.
Until [0], the am65-cpsw driver enabled a single TX DMA Channel, so
unless the user changed this via "ethtool", all traffic from Linux was
transmitted on DMA Channel 0. Configuring the Host Port for priority
based dequeuing or Round-Robin operation was therefore equivalent, since
there was only a single DMA Channel.
Since [0], all 8 TX DMA Channels are enabled by default. Additionally,
the default "tc mapping" doesn't take into account the possibility of
different traffic profiles which various users might have. This results
in traffic starvation at the Host Port due to the priority based dequeuing
which has been enabled by default since the inception of the driver. The
traffic starvation triggers NETDEV WATCHDOG timeout for all TX DMA Channels
that haven't been serviced due to the presence of traffic on the higher
priority TX DMA Channels.
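
The starvation effect can be illustrated with a toy scheduler. This is a
sketch only: it does not model the CPSW hardware or the driver, the names
(sketch_dequeue(), sketch_low_queue_served()) are hypothetical, and it
assumes for illustration that a higher queue index means higher priority.

```c
#include <stddef.h>

#define SKETCH_NUM_QUEUES 8

enum sketch_mode { SKETCH_PRIORITY, SKETCH_ROUND_ROBIN };

/* Serve up to `budget` packets from the per-queue backlogs. */
static void sketch_dequeue(enum sketch_mode mode,
			   unsigned int backlog[SKETCH_NUM_QUEUES],
			   unsigned int served[SKETCH_NUM_QUEUES],
			   unsigned int budget)
{
	size_t next = 0;	/* round-robin cursor */
	size_t i;

	for (i = 0; i < SKETCH_NUM_QUEUES; i++)
		served[i] = 0;

	while (budget > 0) {
		int pick = -1;

		if (mode == SKETCH_PRIORITY) {
			/* Assumption: highest index is highest priority. */
			for (int q = SKETCH_NUM_QUEUES - 1; q >= 0; q--)
				if (backlog[q]) {
					pick = q;
					break;
				}
		} else {
			/* Visit queues in turn, starting after the last one served. */
			for (i = 0; i < SKETCH_NUM_QUEUES; i++) {
				size_t q = (next + i) % SKETCH_NUM_QUEUES;

				if (backlog[q]) {
					pick = (int)q;
					next = q + 1;
					break;
				}
			}
		}
		if (pick < 0)
			break;	/* all queues are empty */
		backlog[pick]--;
		served[pick]++;
		budget--;
	}
}

/* Packets served from queue 0 while queue 7 stays busy. */
static unsigned int sketch_low_queue_served(enum sketch_mode mode)
{
	unsigned int backlog[SKETCH_NUM_QUEUES] = { 0 };
	unsigned int served[SKETCH_NUM_QUEUES];

	backlog[0] = 10;	/* low-priority traffic */
	backlog[7] = 1000;	/* sustained high-priority traffic */
	sketch_dequeue(mode, backlog, served, 100);
	return served[0];
}
```

With a budget of 100 packets, the priority mode serves nothing from queue
0 in this model, whereas the round-robin mode drains all 10 of its packets.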
Fix this by defaulting to Round-Robin dequeuing at the Host Port, which
shall ensure that traffic is dequeued from all TX DMA Channels irrespective
of the traffic profile. This will address the NETDEV WATCHDOG timeouts.
At the same time, users can still switch from Round-Robin to Priority
based dequeuing at the Host Port with the help of the "p0-rx-ptype-rrobin"
private flag of "ethtool". Users are expected to set up an appropriate
"tc mapping" that suits their traffic profile when switching to priority
based dequeuing at the Host Port.
[0] commit be397ea3473d ("net: ethernet: am65-cpsw: Set default TX channels to maximum")
Fixes: be397ea3473d ("net: ethernet: am65-cpsw: Set default TX channels to maximum")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
Hello,
This patch is based on commit
8faabc041a00 Merge tag 'net-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
of Mainline Linux.
Regards,
Siddharth.
drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 14e1df721f2e..5465bf872734 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -3551,7 +3551,7 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev)
init_completion(&common->tdown_complete);
common->tx_ch_num = AM65_CPSW_DEFAULT_TX_CHNS;
common->rx_ch_num_flows = AM65_CPSW_DEFAULT_RX_CHN_FLOWS;
- common->pf_p0_rx_ptype_rrobin = false;
+ common->pf_p0_rx_ptype_rrobin = true;
common->default_vlan = 1;
common->ports = devm_kcalloc(dev, common->port_num,
--
2.43.0
From: Chuck Lever <chuck.lever(a)oracle.com>
Testing shows that the EBUSY error return from mtree_alloc_cyclic()
leaks into user space. The ERRORS section of "man creat(2)" says:
> EBUSY O_EXCL was specified in flags and pathname refers
> to a block device that is in use by the system
> (e.g., it is mounted).
ENOSPC is closer to what applications expect in this situation.
Note that the normal range of simple directory offset values is
2..2^63, so hitting this error is going to be rare to impossible.
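
The remapping idea can be sketched in isolation as below. The helper name
sketch_map_alloc_error() is hypothetical and exists only for illustration;
it is not a libfs function, and the sketch uses the userspace <errno.h>
constants.

```c
#include <errno.h>

/*
 * Translate an allocator's "range exhausted" result into the error an
 * application creating a file would more plausibly expect. Any other
 * value, including success, passes through unchanged.
 */
static int sketch_map_alloc_error(int ret)
{
	if (ret == -EBUSY)
		return -ENOSPC;
	return ret;
}
```

For example, sketch_map_alloc_error(-EBUSY) yields -ENOSPC, while
-ENOMEM and 0 are returned as they are.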
Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets")
Cc: <stable(a)vger.kernel.org> # v6.9+
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/libfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index 748ac5923154..3da58a92f48f 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -292,8 +292,8 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
LONG_MAX, &octx->next_offset, GFP_KERNEL);
- if (ret < 0)
- return ret;
+ if (unlikely(ret < 0))
+ return ret == -EBUSY ? -ENOSPC : ret;
offset_set(dentry, offset);
return 0;
--
2.47.0
Hello,
The commit below was backported to 6.12, but I would like to request a
backport to the 6.6 longterm branch, which we are currently using:
commit c809b0d0e52d01c30066367b2952c4c4186b1047
Author: Borislav Petkov (AMD) <bp(a)alien8.de>
Date: 2024-11-19 12:21:33 +0100
x86/microcode/AMD: Flush patch buffer mapping after application
[...]
The patch itself is small, but the consequence of not patching is large on
affected systems (tens of seconds to minutes of boot delay). See the
original discussion [1] for details.
The patch in master relies on a variable 'bsp_cpuid_1_eax' introduced in commit
94838d230a6c ("x86/microcode/AMD: Use the family,model,stepping encoded in the
patch ID"), but porting that entire commit seems excessive, especially because
there are several 'Fixes' commits for that one (e.g. 5343558a868e, d1744a4c975b,
1d81d85d1a19).
I think the simplest prerequisite change is the following (for Borislav
Petkov to confirm):
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index bbd1dc38ea03..555fa76bd1f3 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -96,6 +97,8 @@ struct cont_desc {
static u32 ucode_new_rev;
+static u32 bsp_cpuid_1_eax __ro_after_init;
+
/*
* Microcode patch container file is prepended to the initrd in cpio
* format. See Documentation/arch/x86/microcode.rst
@@ -551,6 +566,7 @@ static void apply_ucode_from_containers(unsigned int cpuid_1_eax)
void load_ucode_amd_early(unsigned int cpuid_1_eax)
{
+ bsp_cpuid_1_eax = cpuid_1_eax;
return apply_ucode_from_containers(cpuid_1_eax);
}
Thanks,
Thomas
[1] https://lore.kernel.org/lkml/ZyulbYuvrkshfsd2@antipodes/T/