From: Li Nan linan122@huawei.com
commit 9c47127a807da3e36ce80f7c83a1134a291fc021 upstream.
Raid checks if pad3 is zero when loading superblock from disk. Arrays created with new features may fail to assemble on old kernels as pad3 is used.
Add module parameter check_new_feature to bypass this check.
Link: https://lore.kernel.org/linux-raid/20251103125757.1405796-5-linan666@huaweic...
Signed-off-by: Li Nan linan122@huawei.com
Reviewed-by: Xiao Ni xni@redhat.com
Signed-off-by: Yu Kuai yukuai@fnnas.com
---
 drivers/md/md.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e033c26fdd4..9d9cb7e1e6e8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -340,6 +340,7 @@ static int start_readonly;
  */
 static bool create_on_open = true;
 static bool legacy_async_del_gendisk = true;
+static bool check_new_feature = true;
 
 /*
  * We have a system wide 'event count' that is incremented
@@ -1752,9 +1753,13 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 	}
 	if (sb->pad0 ||
 	    sb->pad3[0] ||
-	    memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1])))
-		/* Some padding is non-zero, might be a new feature */
-		return -EINVAL;
+	    memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1]))) {
+		pr_warn("Some padding is non-zero on %pg, might be a new feature\n",
+			rdev->bdev);
+		if (check_new_feature)
+			return -EINVAL;
+		pr_warn("check_new_feature is disabled, data corruption possible\n");
+	}
 
 	rdev->preferred_minor = 0xffff;
 	rdev->data_offset = le64_to_cpu(sb->data_offset);
@@ -10459,6 +10464,7 @@ module_param(start_dirty_degraded, int, S_IRUGO|S_IWUSR);
 module_param_call(new_array, add_named_array, NULL, NULL, S_IWUSR);
 module_param(create_on_open, bool, S_IRUSR|S_IWUSR);
 module_param(legacy_async_del_gendisk, bool, 0600);
+module_param(check_new_feature, bool, 0600);
 
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("MD RAID framework");
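For readers following the thread, here is a minimal user-space sketch of the padding check the patch modifies. It is not the kernel code: `struct sb_padding`, `check_padding()` and the 32-byte length of `pad3` are assumptions made for illustration, and the global flag only stands in for the module parameter. The `memcmp()` of the array against itself shifted by one element, combined with the test of `pad3[0]`, is the same all-zero test used in `super_1_load()`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the padding fields of the v1.x superblock. */
struct sb_padding {
	uint32_t pad0;
	uint8_t pad3[32];	/* length chosen for illustration only */
};

/* Stands in for the module parameter; true keeps the strict behaviour. */
static bool check_new_feature = true;

/* Returns 0 if loading may continue, -EINVAL if the superblock is rejected. */
static int check_padding(const struct sb_padding *sb)
{
	/*
	 * pad3[0] == 0 together with pad3[i] == pad3[i + 1] for every i
	 * implies that every byte of pad3 is zero.
	 */
	bool nonzero = sb->pad0 || sb->pad3[0] ||
		       memcmp(sb->pad3, sb->pad3 + 1,
			      sizeof(sb->pad3) - sizeof(sb->pad3[1]));

	if (!nonzero)
		return 0;
	if (check_new_feature)
		return -EINVAL;	/* unknown feature: refuse to load */
	return 0;		/* check bypassed: caller proceeds at its own risk */
}
```

Note that the flag is global in this sketch, as it is in the patch, so it cannot distinguish one array from another.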
On Wed, Dec 17, 2025 at 10:31:30PM +0500, Roman Mamedov wrote:
> This is still a bad scenario. Original problem:
> - Boot into a new kernel once, reboot into the old one, the existing array no longer works.
> After this patch:
> - Same, unless you know how and where to add the right module parameter so that it is passed to the md module on load. That might not be convenient if the root FS didn't assemble and mount and is inaccessible.
> Not ideal whatsoever.
> Wouldn't it be possible to implement minimal *automatic* recognition (and ignoring) of those newly utilized bits instead?
Yes, that should be done instead.
And again, a module parameter does not work for multiple devices in a system, the upstream change should also be reverted.
thanks,
greg k-h
On 2025/12/18 14:30, Greg KH wrote:
> Yes, that should be done instead.
> And again, a module parameter does not work for multiple devices in a system, the upstream change should also be reverted.
> thanks,
> greg k-h
We propose the following fix for this issue. With the fix, md arrays created on old kernels won't be affected by this feature.
https://lore.kernel.org/linux-raid/825e532d-d1e1-44bb-5581-692b7c091796@huaw...
The method is:
Only set lbs by default for new arrays. When assembling an existing array, leave the lbs field unset. In that case the data loss problem is not fixed, so we should also print a warning that guides users to set lbs to fix the problem, together with a notice that the array will then not be assembled on old kernels.
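The method described above can be sketched as a small decision function. This is only an illustration of the stated policy, not the proposed patch: `struct array_state`, `choose_sb_lbs()`, the convention that 0 means "unset", and the warning text are all hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-array state; sb_lbs == 0 means "lbs not recorded". */
struct array_state {
	bool is_new;		/* array is being created, not assembled */
	unsigned int sb_lbs;	/* lbs stored in the superblock, 0 if unset */
};

/* Returns the lbs value to record in the superblock; 0 leaves it unset. */
static unsigned int choose_sb_lbs(const struct array_state *a,
				  unsigned int default_lbs)
{
	if (a->sb_lbs)
		return a->sb_lbs;	/* already set explicitly: keep it */
	if (a->is_new)
		return default_lbs;	/* new array: record lbs by default */

	/* Existing array: stay compatible with old kernels, but warn. */
	fprintf(stderr,
		"warning: lbs is unset; setting it fixes the data loss problem "
		"but the array will no longer assemble on old kernels\n");
	return 0;
}
```

Under this policy an array created on an old kernel keeps an all-zero padding area until the user opts in, so it still assembles after a downgrade.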
On Thu, Dec 18, 2025 at 02:57:23PM +0800, Li Nan wrote:
> We propose the following fix for this issue. With the fix, md arrays created on old kernels won't be affected by this feature.
> https://lore.kernel.org/linux-raid/825e532d-d1e1-44bb-5581-692b7c091796@huaw...
> The method is:
> Only set lbs by default for new arrays. When assembling an existing array, leave the lbs field unset. In that case the data loss problem is not fixed, so we should also print a warning that guides users to set lbs to fix the problem, together with a notice that the array will then not be assembled on old kernels.
Great, have a patch for this?
thanks,
greg k-h
On 2025/12/18 15:14, Greg KH wrote:
> Great, have a patch for this?
> thanks,
> greg k-h
I'm finalizing and testing the patch now and will send it out shortly.
Sorry for any inconvenience caused.
On Wed, Dec 17, 2025 at 09:05:13PM +0800, linan666@huaweicloud.com wrote:
> From: Li Nan linan122@huawei.com
> commit 9c47127a807da3e36ce80f7c83a1134a291fc021 upstream.
> Raid checks if pad3 is zero when loading superblock from disk. Arrays created with new features may fail to assemble on old kernels as pad3 is used.
> Add module parameter check_new_feature to bypass this check.
This is a new feature, why does it need to go to stable kernels?
And a module parameter? Ugh, this isn't the 1990's anymore, this is not good and will be a mess over time (think multiple devices...)
thanks,
greg k-h
Hi,
On 2025/12/17 22:04, Greg KH wrote:
> This is a new feature, why does it need to go to stable kernels?
> And a module parameter? Ugh, this isn't the 1990's anymore, this is not good and will be a mess over time (think multiple devices...)
Nan didn't mention the background. We won't backport the new feature to stable kernels (although it fixes a data loss problem when an array is created with disks of different lbs, and anyone interested can do the backport). However, this backport is only meant to give users a possible way to still assemble arrays after switching to old LTS kernels when they are using the default lbs.
I think this is fine as a way to provide forward compatibility. Please let us know if you do not like this, or whether Nan should send a new version with an explanation.
> thanks,
> greg k-h
On Thu, 18 Dec 2025 01:11:43 +0800 "Yu Kuai" yukuai@fnnas.com wrote:
> Nan didn't mention the background. We won't backport the new feature to stable kernels (although it fixes a data loss problem when an array is created with disks of different lbs, and anyone interested can do the backport). However, this backport is only meant to give users a possible way to still assemble arrays after switching to old LTS kernels when they are using the default lbs.
This is still a bad scenario. Original problem:
- Boot into a new kernel once, reboot into the old one, the existing array no longer works.
After this patch:
- Same, unless you know how and where to add the right module parameter so that it is passed to the md module on load. That might not be convenient if the root FS didn't assemble and mount and is inaccessible.
Not ideal whatsoever.
Wouldn't it be possible to implement minimal *automatic* recognition (and ignoring) of those newly utilized bits instead?
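One way to picture the automatic recognition asked about above is a check that tolerates a known byte range inside the padding and rejects only bytes it does not recognise. The sketch below is purely illustrative: the offset and length of the "known" range, the 32-byte padding size and the function name are assumptions, and nothing in the thread says where the new feature actually stores its data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAD3_LEN  32	/* illustrative padding size */
#define KNOWN_OFF 4	/* hypothetical offset of a recognised field */
#define KNOWN_LEN 4	/* hypothetical length of that field */

/* Returns true if any byte outside the recognised range is non-zero. */
static bool pad3_has_unknown_bits(const uint8_t pad3[PAD3_LEN])
{
	for (size_t i = 0; i < PAD3_LEN; i++) {
		bool known = i >= KNOWN_OFF && i < KNOWN_OFF + KNOWN_LEN;

		if (!known && pad3[i])
			return true;	/* genuinely unknown feature */
	}
	return false;	/* only recognised bytes are in use: safe to ignore */
}
```

Because the decision is made per superblock, it needs no module parameter and behaves the same for every array in the system, which also addresses the multiple-devices objection raised earlier in the thread.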
linux-stable-mirror@lists.linaro.org