[BUG] With block-group-tree feature enabled, mounting it with clear_cache would cause the following transaction abort at mount or remount:
BTRFS info (device dm-4): force clearing of disk cache BTRFS info (device dm-4): using free space tree BTRFS info (device dm-4): auto enabling async discard BTRFS info (device dm-4): clearing free space tree BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1) BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2) BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes BTRFS error (device dm-4): super block corruption detected before writing it to disk BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected) BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
[CAUSE] For block-group-tree feature, we have an artificial dependency on free-space-tree.
This means if we detects block-group-tree without v2 cache, we consider it a corruption and cause the problem.
For clear_cache mount option, it would temporary disable v2 cache, then re-enable it.
But unfortunately for that temporary v2 cache disabled status, we refuse to write a superblock with bg tree only flag, thus leads to the above transaction abortion.
[FIX] For now, just reject clear_cache and v1 cache mount option for block group tree. So now we got a graceful rejection other than a transaction abort:
BTRFS info (device dm-4): force clearing of disk cache BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature BTRFS error (device dm-4): open_ctree failed
Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo wqu@suse.com --- For the proper fix, we need to change the behavior of clear_cache and v1 cache switch.
For pure clear_cache without switch cache version, we should allow rebuilding v2 cache without fully disable v2 cache. --- fs/btrfs/super.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 581845bc206a..eefae0318d4f 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -826,7 +826,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options, !btrfs_test_opt(info, CLEAR_CACHE)) { btrfs_err(info, "cannot disable free space tree"); ret = -EINVAL; - + } + if (btrfs_fs_compat_ro(info, BLOCK_GROUP_TREE) && + (btrfs_test_opt(info, CLEAR_CACHE) || + !btrfs_test_opt(info, FREE_SPACE_TREE))) { + btrfs_err(info, "cannot disable free space tree with block-group-tree feature"); + ret = -EINVAL; } if (!ret) ret = btrfs_check_mountopts_zoned(info);
On Thu, Apr 27, 2023 at 09:45:32AM +0800, Qu Wenruo wrote:
[BUG] With block-group-tree feature enabled, mounting it with clear_cache would cause the following transaction abort at mount or remount:
BTRFS info (device dm-4): force clearing of disk cache BTRFS info (device dm-4): using free space tree BTRFS info (device dm-4): auto enabling async discard BTRFS info (device dm-4): clearing free space tree BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1) BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2) BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes BTRFS error (device dm-4): super block corruption detected before writing it to disk BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected) BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
[CAUSE] For block-group-tree feature, we have an artificial dependency on free-space-tree.
This means if we detects block-group-tree without v2 cache, we consider it a corruption and cause the problem.
For clear_cache mount option, it would temporary disable v2 cache, then re-enable it.
But unfortunately for that temporary v2 cache disabled status, we refuse to write a superblock with bg tree only flag, thus leads to the above transaction abortion.
[FIX] For now, just reject clear_cache and v1 cache mount option for block group tree. So now we got a graceful rejection other than a transaction abort:
BTRFS info (device dm-4): force clearing of disk cache BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature BTRFS error (device dm-4): open_ctree failed
Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo wqu@suse.com
For the proper fix, we need to change the behavior of clear_cache and v1 cache switch.
For pure clear_cache without switch cache version, we should allow rebuilding v2 cache without fully disable v2 cache.
There was an issue to clarify docs about the space caches and it's a mess, once we had v2 the number of states has increased and it's becoming user unfriendly. At least we have the block-group-tree and free-space-tree tied together and this should be the focus of the feature compatibility.
I think it should be acceptable to slightly change the behaviour for some obscure combination v1/v2/clear/bgt and either reject some of them or suggest more than one step how to do it. E.g. first fully convert from v1 to v2 and then to bgt, or allow v2/bgt/clear in one go.
Added to misc-next, thanks.
On 2023/4/28 07:32, David Sterba wrote:
On Thu, Apr 27, 2023 at 09:45:32AM +0800, Qu Wenruo wrote:
[BUG] With block-group-tree feature enabled, mounting it with clear_cache would cause the following transaction abort at mount or remount:
BTRFS info (device dm-4): force clearing of disk cache BTRFS info (device dm-4): using free space tree BTRFS info (device dm-4): auto enabling async discard BTRFS info (device dm-4): clearing free space tree BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1) BTRFS info (device dm-4): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2) BTRFS error (device dm-4): block-group-tree feature requires fres-space-tree and no-holes BTRFS error (device dm-4): super block corruption detected before writing it to disk BTRFS: error (device dm-4) in write_all_supers:4288: errno=-117 Filesystem corrupted (unexpected superblock corruption detected) BTRFS warning (device dm-4: state E): Skipping commit of aborted transaction.
[CAUSE] For block-group-tree feature, we have an artificial dependency on free-space-tree.
This means if we detects block-group-tree without v2 cache, we consider it a corruption and cause the problem.
For clear_cache mount option, it would temporary disable v2 cache, then re-enable it.
But unfortunately for that temporary v2 cache disabled status, we refuse to write a superblock with bg tree only flag, thus leads to the above transaction abortion.
[FIX] For now, just reject clear_cache and v1 cache mount option for block group tree. So now we got a graceful rejection other than a transaction abort:
BTRFS info (device dm-4): force clearing of disk cache BTRFS error (device dm-4): cannot disable free space tree with block-group-tree feature BTRFS error (device dm-4): open_ctree failed
Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo wqu@suse.com
For the proper fix, we need to change the behavior of clear_cache and v1 cache switch.
For pure clear_cache without switch cache version, we should allow rebuilding v2 cache without fully disable v2 cache.
There was an issue to clarify docs about the space caches and it's a mess, once we had v2 the number of states has increased and it's becoming user unfriendly. At least we have the block-group-tree and free-space-tree tied together and this should be the focus of the feature compatibility.
I think it should be acceptable to slightly change the behaviour for some obscure combination v1/v2/clear/bgt and either reject some of them or suggest more than one step how to do it. E.g. first fully convert from v1 to v2 and then to bgt, or allow v2/bgt/clear in one go.
On the clear cache behavior, I'm working on making it independent from the current cache version. E.g. on v2 cache, clear_cache would just clear the cache, without (temporarily) falling back to v1.
This should slightly reduce the complexity for the free space cache matrix.
To me, the biggest inconsistency comes from the fact that free space cache version can be too easily modified just by a mount option.
While compat_ro flags normally should be enabled/disabled by btrfstune tools.
Thanks, Qu
Added to misc-next, thanks.
linux-stable-mirror@lists.linaro.org