[BUG] For a 4K sector sized btrfs with v1 cache enabled and only mounted on systems with 4K page size, if it's mounted on subpage (64K page size) systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such warning will bother end users, especially if they want to switch the same btrfs seamlessly between different page sized systems.
[CAUSE] V1 free space cache is still using fixed PAGE_SIZE for various bitmap, like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1 cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with csum mismatch.
[FIX] Although we should fix v1 cache, it's already going to be marked deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo blnxfsl@bluematt.me CC: stable@vger.kernel.org # 5.15+ Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@blu... Signed-off-by: Qu Wenruo wqu@suse.com --- fs/btrfs/disk-io.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d456f426924c..34eb6d4b904a 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3675,6 +3675,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device if (sectorsize < PAGE_SIZE) { struct btrfs_subpage_info *subpage_info;
+ /* + * V1 space cache has some hardcoded PAGE_SIZE usage, and is + * going to be deprecated. + * + * Force to use v2 cache for subpage case. + */ + btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE); + btrfs_set_and_info(fs_info, FREE_SPACE_TREE, + "forcing free space tree for sector size %u with page size %lu", + sectorsize, PAGE_SIZE); + btrfs_warn(fs_info, "read-write for sector size %u with page size %lu is experimental", sectorsize, PAGE_SIZE);
On Fri, Apr 01, 2022 at 03:29:37PM +0800, Qu Wenruo wrote:
[BUG] For a 4K sector sized btrfs with v1 cache enabled and only mounted on systems with 4K page size, if it's mounted on subpage (64K page size) systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such warning will bother end users, especially if they want to switch the same btrfs seamlessly between different page sized systems.
[CAUSE] V1 free space cache is still using fixed PAGE_SIZE for various bitmap, like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1 cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with csum mismatch.
[FIX] Although we should fix v1 cache, it's already going to be marked deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo blnxfsl@bluematt.me CC: stable@vger.kernel.org # 5.15+ Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@blu... Signed-off-by: Qu Wenruo wqu@suse.com
Reviewed-by: Josef Bacik josef@toxicpanda.com
Thanks,
Josef
On Fri, Apr 01, 2022 at 03:29:37PM +0800, Qu Wenruo wrote:
[BUG] For a 4K sector sized btrfs with v1 cache enabled and only mounted on systems with 4K page size, if it's mounted on subpage (64K page size) systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such warning will bother end users, especially if they want to switch the same btrfs seamlessly between different page sized systems.
[CAUSE] V1 free space cache is still using fixed PAGE_SIZE for various bitmap, like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1 cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with csum mismatch.
[FIX] Although we should fix v1 cache, it's already going to be marked deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo blnxfsl@bluematt.me CC: stable@vger.kernel.org # 5.15+ Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@blu... Signed-off-by: Qu Wenruo wqu@suse.com
fs/btrfs/disk-io.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d456f426924c..34eb6d4b904a 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3675,6 +3675,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device if (sectorsize < PAGE_SIZE) { struct btrfs_subpage_info *subpage_info;
/*
* V1 space cache has some hardcoded PAGE_SIZE usage, and is
* going to be deprecated.
*
* Force to use v2 cache for subpage case.
*/
btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
btrfs_set_and_info(fs_info, FREE_SPACE_TREE,
"forcing free space tree for sector size %u with page size %lu",
sectorsize, PAGE_SIZE);
I'm not sure this is implemented the right way. Why is it unconditional? Does any subsequent mount have to clear and set the bits after it has been already? Or what if the free space tree is set at mkfs time, which is now the default.
Next, remounting v1->v2 does more things, like removing the v1 tree if it exists. And due to some bugs there are more bits for free space tree to be set like FREE_SPACE_TREE_VALID. So I don't thing this patch covers all cases for the v2.
On 2022/4/2 00:39, David Sterba wrote:
On Fri, Apr 01, 2022 at 03:29:37PM +0800, Qu Wenruo wrote:
[BUG] For a 4K sector sized btrfs with v1 cache enabled and only mounted on systems with 4K page size, if it's mounted on subpage (64K page size) systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such warning will bother end users, especially if they want to switch the same btrfs seamlessly between different page sized systems.
[CAUSE] V1 free space cache is still using fixed PAGE_SIZE for various bitmap, like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1 cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with csum mismatch.
[FIX] Although we should fix v1 cache, it's already going to be marked deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo blnxfsl@bluematt.me CC: stable@vger.kernel.org # 5.15+ Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@blu... Signed-off-by: Qu Wenruo wqu@suse.com
fs/btrfs/disk-io.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d456f426924c..34eb6d4b904a 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3675,6 +3675,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device if (sectorsize < PAGE_SIZE) { struct btrfs_subpage_info *subpage_info;
/*
* V1 space cache has some hardcoded PAGE_SIZE usage, and is
* going to be deprecated.
*
* Force to use v2 cache for subpage case.
*/
btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
btrfs_set_and_info(fs_info, FREE_SPACE_TREE,
"forcing free space tree for sector size %u with page size %lu",
sectorsize, PAGE_SIZE);
I'm not sure this is implemented the right way. Why is it unconditional?
Isn't the same thing we do when parsing the mount options for switch v1->v2 cache?
Does any subsequent mount have to clear and set the bits after it has been already? Or what if the free space tree is set at mkfs time, which is now the default.
The function btrfs_set_and_info() will only inform the end users if the bit is not set.
Next, remounting v1->v2 does more things, like removing the v1 tree if it exists. And due to some bugs there are more bits for free space tree to be set like FREE_SPACE_TREE_VALID. So I don't thing this patch covers all cases for the v2.
You're right on remounting, but in the opposite way. There is nothing prevent the users from re-enabling v1 cache.
I should also prevent user from setting v1 cache.
Another concern is, I didn't see code cleaning up v1 cache when we do the v1->v2 switch. The only code doing such cleanup is cleanup_free_space_cache_v1() in free-space-cache.c, but it only gets called in btrfs_set_free_space_cache_v1_active().
Or did I miss something?
Thanks, Qu
On Sat, Apr 02, 2022 at 06:57:58AM +0800, Qu Wenruo wrote:
index d456f426924c..34eb6d4b904a 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3675,6 +3675,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device if (sectorsize < PAGE_SIZE) { struct btrfs_subpage_info *subpage_info;
/*
* V1 space cache has some hardcoded PAGE_SIZE usage, and is
* going to be deprecated.
*
* Force to use v2 cache for subpage case.
*/
btrfs_clear_opt(fs_info->mount_opt, SPACE_CACHE);
btrfs_set_and_info(fs_info, FREE_SPACE_TREE,
"forcing free space tree for sector size %u with page size %lu",
sectorsize, PAGE_SIZE);
I'm not sure this is implemented the right way. Why is it unconditional?
Isn't the same thing we do when parsing the mount options for switch v1->v2 cache?
The mount code checks if the v1 is active ie. has some existing structures and then clears v1.
Does any subsequent mount have to clear and set the bits after it has been already? Or what if the free space tree is set at mkfs time, which is now the default.
The function btrfs_set_and_info() will only inform the end users if the bit is not set.
Next, remounting v1->v2 does more things, like removing the v1 tree if it exists. And due to some bugs there are more bits for free space tree to be set like FREE_SPACE_TREE_VALID. So I don't thing this patch covers all cases for the v2.
You're right on remounting, but in the opposite way. There is nothing prevent the users from re-enabling v1 cache.
I should also prevent user from setting v1 cache.
Good point, for the subpage case yes, but otherwise not though it's a valid operation it does not make much sense.
Another concern is, I didn't see code cleaning up v1 cache when we do the v1->v2 switch. The only code doing such cleanup is cleanup_free_space_cache_v1() in free-space-cache.c, but it only gets called in btrfs_set_free_space_cache_v1_active().
Or did I miss something?
No, that's the code I had in mind, and clearing the v1 code once v2 is activated has been added because it was not there before.
On Fri, Apr 01, 2022 at 03:29:37PM +0800, Qu Wenruo wrote:
[BUG] For a 4K sector sized btrfs with v1 cache enabled and only mounted on systems with 4K page size, if it's mounted on subpage (64K page size) systems, it can cause the following warning on v1 space cache:
BTRFS error (device dm-1): csum mismatch on free space cache BTRFS warning (device dm-1): failed to load free space cache for block group 84082688, rebuilding it now
Although not a big deal, as kernel can rebuild it without problem, such warning will bother end users, especially if they want to switch the same btrfs seamlessly between different page sized systems.
[CAUSE] V1 free space cache is still using fixed PAGE_SIZE for various bitmap, like BITS_PER_BITMAP.
Such hard-coded PAGE_SIZE usage will cause various mismatch, from v1 cache size to checksum.
Thus kernel will always reject v1 cache with a different PAGE_SIZE with csum mismatch.
[FIX] Although we should fix v1 cache, it's already going to be marked deprecated soon.
And we have v2 cache based on metadata (which is already fully subpage compatible), and it has almost everything superior than v1 cache.
So just force subpage mount to use v2 cache on mount.
Reported-by: Matt Corallo blnxfsl@bluematt.me CC: stable@vger.kernel.org # 5.15+ Link: https://lore.kernel.org/linux-btrfs/61aa27d1-30fc-c1a9-f0f4-9df544395ec3@blu... Signed-off-by: Qu Wenruo wqu@suse.com
Added to misc-next, thanks.
linux-stable-mirror@lists.linaro.org