From: Qu Wenruo <wqu@suse.com>
commit 979b25c300dbcbcb750e88715018e04e854de6c6 upstream.
[BUG]
For compressed extents, the defrag ioctl will always try to defrag them,
wasting not only IO but also CPU time to compress/decompress:
  mkfs.btrfs -f $DEV
  mount -o compress $DEV $MNT
  xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
  sync
  xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
  sync
  echo "=== before ==="
  xfs_io -c "fiemap -v" $MNT/foobar
  btrfs filesystem defrag $MNT/foobar
  sync
  echo "=== after ==="
  xfs_io -c "fiemap -v" $MNT/foobar
Then it shows that the two 128K extents just get COWed (only their
physical block ranges change) for no benefit, with extra IO/CPU spent:
  === before ===
  /mnt/btrfs/file1:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..255]:        26624..26879       256   0x8
     1: [256..511]:      26632..26887       256   0x9
  === after ===
  /mnt/btrfs/file1:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..255]:        26640..26895       256   0x8
     1: [256..511]:      26648..26903       256   0x9
This affects not only v5.16 (after the defrag rework), but also v5.15 (before the defrag rework).
[CAUSE]
From the very beginning, btrfs defrag has never checked whether an extent
is already at its max capacity (128K for compressed extents, 128M
otherwise). Meanwhile the default extent size threshold is 256K, which is
already beyond the maximum compressed extent size.

This means that, by default, the defrag ioctl will mark every compressed
extent which is not adjacent to a hole/preallocated range for defrag.
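
To see why the default threshold can never pass for compressed extents,
consider this minimal sketch (illustrative only, not part of the patch;
should_defrag_by_size() and the plain #defines are hypothetical stand-ins
for the kernel's logic and constants):

	#include <stdbool.h>
	#include <stdint.h>

	#define MAX_COMPRESSED	(128 * 1024)	/* mirrors BTRFS_MAX_COMPRESSED */
	#define DEFAULT_THRESH	(256 * 1024)	/* default extent size threshold */

	/* Pre-fix decision: the size threshold is the only gate */
	static bool should_defrag_by_size(uint64_t range_len)
	{
		/*
		 * A compressed extent is capped at 128K, so range_len can
		 * never reach the 256K default threshold; every compressed
		 * extent is selected, even though rewriting it cannot
		 * reduce the extent count.
		 */
		return range_len < DEFAULT_THRESH;
	}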
[FIX]
Introduce a helper to grab the maximum extent size, and then in
defrag_collect_targets() and defrag_check_next_extent(), reject extents
which are already at their max capacity.
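
The fixed selection logic distills to something like the following (a
simplified restatement of the diff below; worth_defragging() is a
hypothetical name used for illustration, not a kernel function):

	/* Simplified view of the fixed checks in defrag_collect_targets() */
	static bool worth_defragging(const struct extent_map *em, u32 extent_thresh)
	{
		/* Already at or above the user threshold: skip */
		if (em->len >= extent_thresh)
			return false;
		/*
		 * New check: an extent already at its max capacity cannot be
		 * merged with its neighbors, so rewriting it cannot reduce
		 * the total number of extents.
		 */
		if (em->len >= get_extent_max_capacity(em))
			return false;
		return true;
	}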
Reported-by: Filipe Manana <fdmanana@suse.com>
CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/ioctl.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1020,6 +1020,13 @@ static struct extent_map *defrag_lookup_
 	return em;
 }
 
+static u32 get_extent_max_capacity(const struct extent_map *em)
+{
+	if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags))
+		return BTRFS_MAX_COMPRESSED;
+	return BTRFS_MAX_EXTENT_SIZE;
+}
+
 static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
 				     bool locked)
 {
@@ -1036,6 +1043,12 @@ static bool defrag_check_next_extent(str
 		goto out;
 	if (test_bit(EXTENT_FLAG_PREALLOC, &next->flags))
 		goto out;
+	/*
+	 * If the next extent is at its max capacity, defragging current extent
+	 * makes no sense, as the total number of extents won't change.
+	 */
+	if (next->len >= get_extent_max_capacity(em))
+		goto out;
 	/* Physically adjacent and large enough */
 	if ((em->block_start + em->block_len == next->block_start) &&
 	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
@@ -1233,6 +1246,13 @@ static int defrag_collect_targets(struct
 		if (range_len >= extent_thresh)
 			goto next;
 
+		/*
+		 * Skip extents already at its max capacity, this is mostly for
+		 * compressed extents, which max cap is only 128K.
+		 */
+		if (em->len >= get_extent_max_capacity(em))
+			goto next;
+
 		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
 							  locked);
 		if (!next_mergeable) {